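1. When repairing data, first try to do the repair inside the database. Only if that really is not possible, fall back to an external application that repairs the data over JDBC.
For example: profile_image_url and enlarge_image_url are both fields returned with Weibo user information. The former looks like http://tp2.sinaimg.cn/1928431341/50/5621497131/1, and the latter normally looks like http://tp2.sinaimg.cn/1928431341/180/5621497131/1. To repair the latter, it is enough to replace /50/ with /180/, which PostgreSQL's string functions can do on their own.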
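2. Common functions
2.1 List of common string functions
Note: in the examples below, any string literal can be replaced by a table column. A function can be tested with a statement such as select "char_length"('string'); Entries marked with * are rarely used. In any database the essential string functions are substring, position, length and replace, much like CRUD for a database.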
Function: string || string
Description: String concatenation.
Example: 'Post' || 'greSQL' = PostgreSQL
Function: string || non-string or non-string || string
Description: String concatenation with one non-string input.
Example: 'Value: ' || 42 = Value: 42
Function: bit_length(string)
Description: Number of bits in string.
Example: bit_length('jose') = 32
Function: char_length(string) or character_length(string)
Description: Number of characters in string.
Example: char_length('jose') = 4
This is the same as length:
select "char_length"('string'), "length"('string'); result: 6 6
Function: lower(string)
Description: Convert string to lower case.
Example: lower('ABC') = abc
Function: octet_length(string)
Description: Number of bytes in string.
Example: octet_length('jose') = 4
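The three length functions only differ once the text contains multibyte characters. A minimal sketch, assuming the database encoding is UTF8 (an assumption; other encodings give different byte counts):

```sql
-- Assumes a UTF8 database: each of these two CJK characters takes 3 bytes.
select char_length('數據')  as chars,  -- number of characters
       octet_length('數據') as bytes,  -- number of bytes
       bit_length('數據')   as bits;   -- bytes * 8
```

In a UTF8 database this should return 2, 6 and 48, which is why char_length (or length) is the one to use when computing offsets for substring or overlay.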
Function: overlay(string placing string from int [for int])
Description: Replace substring: replaces a substring of any length, starting at the given position, with a new string.
Example: overlay('Txxxxas' placing 'hom' from 2 for 4) = Thomas
As another example, to replace /50/ with /180/ in 'http://tp2.sinaimg.cn/1928431341/50/5621497131/1', either of the following works:
1. Using overlay:
update t_sns_member set enlarge_image_url = overlay(profile_image_url placing '/180/' from position('/50/' in profile_image_url) for 4) where enlarge_image_url = '';
2. Without a replacement function, building the new string with substring + position + ||:
update t_sns_member set
enlarge_image_url = substring(profile_image_url, 0, position('/50/' in profile_image_url))
  || '/180/'
  || substring(profile_image_url, position('/50/' in profile_image_url) + 4, char_length(profile_image_url))
where enlarge_image_url = '';
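Before running either update, it may be worth previewing the result with a plain select. The sketch below reuses the table and column names from the statements above. The extra position(...) > 0 condition and the limit are additions for safety, not part of the original statements: position returns 0 when the substring is absent, so rows without /50/ are skipped.

```sql
-- Dry run: preview the repaired value for rows that would be updated.
select profile_image_url,
       overlay(profile_image_url placing '/180/'
               from position('/50/' in profile_image_url) for 4) as new_enlarge_image_url
from t_sns_member
where enlarge_image_url = ''
  and position('/50/' in profile_image_url) > 0  -- added guard: skip rows without /50/
limit 20;
```

The same guard can be appended to the where clause of the update once the preview looks right.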
Function: position(substring in string)
Description: Location of specified substring within the string.
Example: position('om' in 'Thomas') = 3
Function: substring(string [from int] [for int])
Description: Extract substring of any length.
Example: substring('Thomas' from 2 for 3) = hom
Function: substring(string from pattern)
Description: Extract substring matching POSIX regular expression. See Section 9.7 for more information on pattern matching.
Example: substring('Thomas' from '...$') = mas
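When the POSIX pattern contains a parenthesized group, substring returns only the text matched by the first group, which is handy for pulling one piece out of a URL. A small sketch using the sample URL from this article; the alias first_segment is just an illustrative name:

```sql
-- The parenthesized group decides what is returned.
-- [.] and [0-9] are used instead of \. and \d to avoid any backslash-escaping concerns.
select substring('http://tp2.sinaimg.cn/1928431341/50/5621497131/1'
                 from 'sinaimg[.]cn/([0-9]+)/') as first_segment;
```

This should return 1928431341, the first path segment after the host name.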
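Function: substring(string from pattern for escape)
Description: Extract substring matching SQL regular expression. The part to return is delimited in the pattern by the escape character followed by a double quote. See Section 9.7 for more information on pattern matching.
Example: substring('Thomas' from '%#"o_a#"_' for '#') = oma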
Function: trim([leading | trailing | both] [characters] from string)
Description: Remove the longest string containing only the characters (a space by default) from the start, end, or both ends of the string. Several characters to remove can be given at once.
Example: trim(both 'x' from 'xTomxx') = Tom
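The three keywords and the default can be compared side by side. A short sketch with made-up input strings:

```sql
select trim(leading 'x' from 'xTomxx')  as lead_only,   -- Tomxx
       trim(trailing 'x' from 'xTomxx') as trail_only,  -- xTom
       trim(both 'xy' from 'yxTomxy')   as both_sides,  -- Tom
       trim('  Tom  ')                  as spaces;      -- Tom
```

Note that the characters argument is treated as a set of characters, not as a fixed prefix or suffix, so 'xy' removes any mix of x and y from the ends.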
Function: upper(string)
Description: Convert string to upper case.
Example: upper('tom') = TOM
Function: ascii(string)
Description: ASCII code of the first character of the argument. For UTF8 returns the Unicode code point of the character. For other multibyte encodings, the argument must be a strictly ASCII character.
Example: ascii('x') = 120
Function: btrim(string text [, characters text])
Description: Remove the longest string consisting only of characters in characters (a space by default) from the start and end of string. Several characters can be given at once.
Example: btrim('xyxtrimyyx', 'xy') = trim
update property set memorial_no = btrim(memorial_no, ' ') where memorial_no like ' %';
or: update property set memorial_no = trim(both ' ' from memorial_no) where memorial_no like ' %';
Function: chr(int)
Description: Character with the given code. For UTF8 the argument is treated as a Unicode code point. For other multibyte encodings the argument must designate a strictly ASCII character. The NULL (0) character is not allowed because text data types cannot store such bytes.
Example: chr(65) = A
Function: convert(string bytea, src_encoding name, dest_encoding name)
Description: Convert string to dest_encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. Conversions can be defined by CREATE CONVERSION. Also there are some predefined conversions. See Table 9-7 for available conversions.
Example: convert('text_in_utf8', 'UTF8', 'LATIN1') = text_in_utf8 represented in ISO 8859-1 encoding
Function: convert_from(string bytea, src_encoding name)
Description: Convert string to the database encoding. The original encoding is specified by src_encoding; the target is always the database encoding. The string must be valid in the source encoding.
Example: convert_from('text_in_utf8', 'UTF8') = text_in_utf8 represented in the current database encoding
Function: convert_to(string text, dest_encoding name)
Description: Convert string to dest_encoding. The source is always the database encoding; the target encoding must be specified.
Example: convert_to('some text', 'UTF8') = some text represented in the UTF8 encoding
Function: decode(string text, type text)
Description: Decode binary data from string previously encoded with encode. Parameter type is same as in encode.
Example: decode('MTIzAAE=', 'base64') = 123\000\001 (bytea shown in escape format; with bytea_output = hex the same value prints as \x3132330001)
Function: encode(data bytea, type text)
Description: Encode binary data to different representation. Supported types are: base64, hex, escape. Escape merely outputs null bytes as \000 and doubles backslashes. This is the inverse of decode.
Example: encode(E'123\\000\\001', 'base64') = MTIzAAE=
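encode and decode work on bytea, so a text value has to pass through convert_to and convert_from on the way in and out. A minimal round-trip sketch, assuming the text '123' and UTF8 as the intermediate encoding:

```sql
-- text -> bytea -> base64 text
select encode(convert_to('123', 'UTF8'), 'base64') as b64;          -- MTIz
-- base64 text -> bytea -> text
select convert_from(decode('MTIz', 'base64'), 'UTF8') as original;  -- 123
```

Going through convert_to and convert_from keeps the conversion explicit instead of relying on an implicit cast between text and bytea.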
Function: initcap(string)
Description: Convert the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters.
Example: initcap('hi THOMAS') = Hi Thomas
Function: length(string)
Description: Number of characters in string.
Example: length('jose') = 4
Function: length(string bytea, encoding name)
Description: Number of characters in string in the given encoding. The string must be valid in this encoding.
Example: length('jose', 'UTF8') = 4
*Function: lpad(string text, length int [, fill text])
Description: Fill up the string to length length by prepending the characters fill (a space by default). If the string is already longer than length then it is truncated (on the right). The fill string may contain several characters; it is repeated as needed.
Example: lpad('hi', 5, 'xy') = xyxhi
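A common use of lpad is zero-padding a number rendered as text, and the truncation rule is easy to miss. A short sketch with made-up values:

```sql
select lpad('42', 5, '0') as zero_padded,  -- 00042
       lpad('hello', 3)   as truncated;    -- hel
```

The second column shows that a string longer than the target length is cut on the right rather than left untouched.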
Function: ltrim(string text [, characters text])
Description: Remove the longest string containing only characters from characters (a space by default) from the start of string. Several characters can be given at once.
Example: ltrim('zzzytrim', 'xyz') = trim
Function: md5(string)
Description: Calculates the MD5 hash of string, returning the result in hexadecimal.
Example: md5('abc') = 900150983cd24fb0d6963f7d28e17f72
*Function: pg_client_encoding()
Description: Current client encoding name.
Example: pg_client_encoding() = SQL_ASCII
*Function: quote_ident(string text)
Description: Return the given string suitably quoted to be used as an identifier in an SQL statement string. Double quotes are added only if necessary (i.e., if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled.
Example: quote_ident('Foo bar') = "Foo bar"
*Function: quote_literal(string text)
Description: Return the given string suitably quoted to be used as a string literal in an SQL statement string, that is, wrapped in single quotes. Embedded single-quotes and backslashes are properly doubled.
Example: quote_literal(E'O\'Reilly') = 'O''Reilly'
*Function: quote_literal(value anyelement)
Description: Coerce the given value to text and then quote it as a literal. Embedded single-quotes and backslashes are properly doubled.
Example: quote_literal(42.5) = '42.5'
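These quoting functions are mostly useful when a statement is assembled as text. The sketch below builds an update statement as a string; the table and column names are borrowed from the earlier example and the value O'Reilly is only illustrative:

```sql
-- Build a statement as text; identifiers and the literal are quoted safely.
select 'update ' || quote_ident('t_sns_member')
       || ' set ' || quote_ident('enlarge_image_url')
       || ' = ' || quote_literal('O''Reilly') as stmt;
```

The result should be update t_sns_member set enlarge_image_url = 'O''Reilly'. The identifiers come back unquoted because they are already lower case and contain no special characters.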
Function: regexp_matches(string text, pattern text [, flags text])
Description: Return all captured substrings resulting from matching a POSIX regular expression against the string, as an array. See Section 9.7.3 for more information.
Example: regexp_matches('foobarbequebaz', '(bar)(beque)') = {bar,beque}
Function: regexp_replace(string text, pattern text, replacement text [, flags text])
Description: Replace substring(s) matching a POSIX regular expression. See Section 9.7.3 for more information.
Example: regexp_replace('Thomas', '.[mN]a.', 'M') = ThM
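Without flags, regexp_replace changes only the first match; the 'g' flag makes it replace every match. A short sketch of the difference:

```sql
select regexp_replace('foobarbaz', 'b..', 'X')      as first_only,   -- fooXbaz
       regexp_replace('foobarbaz', 'b..', 'X', 'g') as all_matches;  -- fooXX
```

This differs from replace, which always changes every occurrence and takes a plain string rather than a pattern.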
*Function: regexp_split_to_array(string text, pattern text [, flags text])
Description: Split string into an array using a POSIX regular expression as the delimiter. See Section 9.7.3 for more information.
Example: regexp_split_to_array('hello world', E'\\s+') = {hello,world}
*Function: regexp_split_to_table(string text, pattern text [, flags text])
Description: Split string into a set of rows using a POSIX regular expression as the delimiter. See Section 9.7.3 for more information.
Example: regexp_split_to_table('hello world', E'\\s+') =
hello
world
(2 rows)
*Function: repeat(string text, number int)
Description: Repeat string the specified number of times.
Example: repeat('Pg', 4) = PgPgPgPg
Function: replace(string text, from text, to text)
Description: Replace all occurrences in string of substring from with substring to.
Example: replace('abcdefabcdef', 'cd', 'XX') = abXXefabXXef
For the sample URL this gives the same result as overlay (replace changes every occurrence, while overlay changes only the range it is given):
select overlay('http://tp2.sinaimg.cn/1928431341/50/5621497131/1' placing '/180/' from position('/50/' in 'http://tp2.sinaimg.cn/1928431341/50/5621497131/1') for 4), replace('http://tp2.sinaimg.cn/1928431341/50/5621497131/1', '/50/', '/180/');
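The two functions diverge as soon as the search string occurs more than once. A small sketch with a made-up string that contains /50/ twice:

```sql
select replace('a/50/b/50/c', '/50/', '/180/') as with_replace,  -- a/180/b/180/c
       overlay('a/50/b/50/c' placing '/180/'
               from position('/50/' in 'a/50/b/50/c') for 4) as with_overlay;  -- a/180/b/50/c
```

For the image URL repair, overlay combined with position is therefore the more conservative choice if /50/ could also appear later in the path.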
*Function: rpad(string text, length int [, fill text])
Description: Fill up the string to length length by appending the characters fill (a space by default). If the string is already longer than length then it is truncated.
Example: rpad('hi', 5, 'xy') = hixyx
Function: rtrim(string text [, characters text])
Description: Remove the longest string containing only characters from characters (a space by default) from the end of string.
Example: rtrim('trimxxxx', 'x') = trim
*Function: split_part(string text, delimiter text, field int)
Description: Split string on delimiter and return the given field (counting from one).
Example: split_part('abc~@~def~@~ghi', '~@~', 2) = def
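split_part can also pick individual path segments out of the sample URL. In the sketch below the column aliases are illustrative names; note that "http:" is field 1 and the empty string between the two slashes is field 2:

```sql
select split_part(url, '/', 4) as segment_4,  -- 1928431341
       split_part(url, '/', 5) as segment_5   -- 50
from (select 'http://tp2.sinaimg.cn/1928431341/50/5621497131/1'::text as url) as t;
```

Checking that field 5 equals '50' is one possible way to confirm a URL has the expected shape before rewriting it.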
Function: strpos(string, substring)
Description: Location of specified substring (same as position(substring in string), but note the reversed argument order).
Example: strpos('high', 'ig') = 2
Equivalent to position:
select strpos('http://tp2.sinaimg.cn/1928431341/50/5621497131/1', '/50/'), position('/50/' in 'http://tp2.sinaimg.cn/1928431341/50/5621497131/1');
Function: substr(string, from [, count])
Description: Extract substring (same as substring(string from from for count)).
Example: substr('alphabet', 3, 2) = ph
*Function: to_ascii(string text [, encoding text])
Description: Convert string to ASCII from another encoding (only supports conversion from LATIN1, LATIN2, LATIN9, and WIN1250 encodings).
Example: to_ascii('Karel') = Karel
*Function: to_hex(number int or bigint)
Description: Convert number to its equivalent hexadecimal representation.
Example: to_hex(2147483647) = 7fffffff
*Function: translate(string text, from text, to text)
Description: Any character in string that matches a character in the from set is replaced by the corresponding character in the to set. Both from and to may list several characters; the mapping is character by character.
Example: translate('12345', '14', 'ax') = a23x5
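When from is longer than to, the extra characters in from are deleted, which makes translate a compact way to strip unwanted characters. A short sketch; the phone-number string is made-up sample data:

```sql
select translate('12345', '143', 'ax')          as mapped_and_dropped,  -- a2x5
       translate('(020) 555-1234', '() -', '')  as digits_only;         -- 0205551234
```

Unlike replace, translate works on single characters rather than substrings, so it cannot be used for the /50/ to /180/ repair.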