
Python regular expression explanation


Introduction

Regular expression matching comes up constantly in day-to-day development, for example when collecting device information, filtering configuration files, or extracting elements from web pages. In Python all of this is handled by the re module, and there is a lot worth summarizing. This article covers the functions used most often and the pitfalls encountered along the way.

Regular expressions

Lessons learned from using regular expressions:

  • Preprocess large inputs first to remove redundant characters such as line breaks. Runs of multiple spaces can be replaced with a single space.
  • When fetching data to process, request only the data you need as far as possible. This reduces interference from unrelated data and also improves transfer efficiency.
  • Use fixed start and end characters in the pattern to extract multiple child elements.
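The first and third tips can be sketched as follows. The sample strings `raw` and `html` are made up for illustration and do not come from any real device or page.

```python
import re

# Hypothetical raw text containing line breaks, a tab, and repeated spaces.
raw = "name:   eth0\r\n\r\nstate:\t up   \n"

# Collapse every run of whitespace (spaces, tabs, line breaks) into one space.
cleaned = re.sub(r"\s+", " ", raw).strip()
print(cleaned)  # name: eth0 state: up

# Hypothetical markup: fixed start and end delimiters around each child element.
html = "<ul><li>one</li><li>two</li></ul>"

# The non-greedy group captures the text between each <li> ... </li> pair.
items = re.findall(r"<li>(.*?)</li>", html)
print(items)  # ['one', 'two']
```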

Regular expression syntax is not covered here; refer to the online documentation for that.

The re module

re is Python's module for regular expression matching operations.

Commonly used re functions:

re.search(pattern, string, flags=0)
import re
text = "Hello, World!"
# Returns a match object for the first uppercase letter, "H"
re.search(r"[A-Z]", text)

Notes:

  • pattern is the regular expression, string is the string to search, and flags holds the regular expression flags.
  • search scans the entire string for the first position where the regular expression matches and returns the corresponding match object. It returns None if nothing matches.
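A minimal sketch of working with the return value of search, including the None case and the flags parameter. The variable names are chosen here for illustration.

```python
import re

text = "Hello, World!"

# search returns a match object for the first uppercase letter.
m = re.search(r"[A-Z]", text)
if m is not None:
    print(m.group(), m.span())  # H (0, 1)

# No digit anywhere in the string, so the result is None.
missing = re.search(r"[0-9]", text)
print(missing)  # None

# flags changes how the pattern is applied, here case-insensitively.
ci = re.search(r"world", text, flags=re.IGNORECASE)
print(ci.group())  # World
```

Checking for None before calling group() avoids an AttributeError when the pattern does not match.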

re.match(pattern, string, flags=0)
re.fullmatch(pattern, string, flags=0)

import re
text = "Hello,World"
re.match("[a-z]", text)        # None: the first character "H" is uppercase
re.fullmatch(r"\S+", text)     # matches: the whole string is non-whitespace

Notes:

  • match applies the regular expression at the beginning of the string. If the characters at the start match the pattern it returns a match object; otherwise it returns None.
  • Be careful to distinguish match from search: match checks only the beginning of the string, while search checks every position.
  • fullmatch returns a match object only if the entire string matches the regular expression; otherwise it returns None.
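The difference between the three functions is easiest to see by running them against the same string. This is a small illustrative comparison; the result variable names are made up.

```python
import re

text = "Hello,World"

# match only looks at the start: "H" is not a lowercase letter.
at_start = re.match(r"[a-z]", text)
print(at_start)  # None

# search looks everywhere and finds the first lowercase letter.
anywhere = re.search(r"[a-z]", text)
print(anywhere.group())  # e

# fullmatch requires the pattern to cover the whole string.
whole = re.fullmatch(r"\S+", text)
print(whole.group())  # Hello,World

# The comma is not a letter, so the whole string does not match.
letters_only = re.fullmatch(r"[A-Za-z]+", text)
print(letters_only)  # None
```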

re.split(pattern, string, maxsplit=0, flags=0)

import re
text = "aJ33Sjd3231ssfj22323SSdjdSSSDddss"
re.split("([0-9]+)", text)  # capturing group: the digit runs are kept in the result
re.split("[0-9]+", text)    # no group: the digit runs are dropped

Notes:

  • split splits string at each match of the regular expression. If the pattern contains capturing parentheses, the text matched by the groups is also kept in the resulting list.
  • maxsplit is the maximum number of splits. Once it is reached, the rest of the string is returned as the last element of the list.
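The effect of capturing parentheses and of maxsplit can be checked with the sample string from above. This is a short sketch; maxsplit is passed by keyword, which is the safer form on current Python versions.

```python
import re

text = "aJ33Sjd3231ssfj22323SSdjdSSSDddss"

# Without a capturing group the separators (digit runs) are discarded.
plain = re.split(r"[0-9]+", text)
print(plain)  # ['aJ', 'Sjd', 'ssfj', 'SSdjdSSSDddss']

# With a capturing group the separators are kept in the list.
kept = re.split(r"([0-9]+)", text)
print(kept)  # ['aJ', '33', 'Sjd', '3231', 'ssfj', '22323', 'SSdjdSSSDddss']

# maxsplit=1 splits once; the remainder is the last element.
once = re.split(r"[0-9]+", text, maxsplit=1)
print(once)  # ['aJ', 'Sjd3231ssfj22323SSdjdSSSDddss']
```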

re.findall(pattern, string, flags=0)
re.finditer(pattern, string, flags=0)

import re
text = "aJ33Sjd3231ssfj22323SSdjdSSSDddss"
a = re.findall("[0-9]+", text)
print(a)
b = re.finditer("[0-9]+", text)
for i in b:
    print(i.group())

Notes:

  • findall scans string from left to right and returns all matches, in the order found, as a list.
  • finditer scans string from left to right and returns an iterator that yields the match objects in the order found.
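Because finditer yields match objects rather than plain strings, it also gives access to the position of each match. A brief sketch using the same sample string:

```python
import re

text = "aJ33Sjd3231ssfj22323SSdjdSSSDddss"

# findall returns the matched substrings only.
numbers = re.findall(r"[0-9]+", text)
print(numbers)  # ['33', '3231', '22323']

# finditer yields match objects, so the span of each match is available too.
spans = [(m.group(), m.span()) for m in re.finditer(r"[0-9]+", text)]
print(spans)  # [('33', (2, 4)), ('3231', (7, 11)), ('22323', (15, 20))]
```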

re.sub(pattern, repl, string, count=0, flags=0)
re.subn(pattern, repl, string, count=0, flags=0)

import re
text = "aJ33Sjd3231ssfj22323SSdjdSSSDddss"
a = re.sub("[0-9]", "*", text)
b = re.subn("[0-9]", "*", text)
print(a)
print(b)

Notes:

  • sub replaces each match in string with repl and returns the resulting string. count is the maximum number of replacements; the default of 0 replaces all matches.
  • subn behaves the same as sub, but returns a tuple (new string, number of replacements).
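The count parameter and the tuple returned by subn can be illustrated like this, again with the sample string from above:

```python
import re

text = "aJ33Sjd3231ssfj22323SSdjdSSSDddss"

# count=2 replaces only the first two digits and leaves the rest alone.
first_two = re.sub(r"[0-9]", "*", text, count=2)
print(first_two)  # aJ**Sjd3231ssfj22323SSdjdSSSDddss

# subn also reports how many replacements were made.
masked, n = re.subn(r"[0-9]", "*", text)
print(masked, n)  # aJ**Sjd****ssfj*****SSdjdSSSDddss 11
```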

Regular expression objects

re.compile(pattern, flags=0)

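import re
# Raw string, so the backslashes reach the regex engine unchanged
prog = re.compile(r'<div[\s\S]*?class="([\s\S]*?)"[\s\S]*?>')
text = '<div class="tab" >'
prog.search(text)   # match object covering the whole tag
prog.findall(text)  # ['tab']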

Notes:

  • compile turns a regular expression into a regular expression object, and matching is then done through the methods of that object.
  • When the same regular expression is needed many times, calling re.compile() once and keeping the resulting object for reuse makes the program more efficient.
  • The methods offered by regular expression objects correspond to the common re functions described above.
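A sketch of the reuse pattern described above: the expression is compiled once and then applied to several inputs. The list `lines` is invented sample data, and the name `div_class` is chosen here for illustration.

```python
import re

# Compile once, outside the loop.
div_class = re.compile(r'<div[\s\S]*?class="([\s\S]*?)"[\s\S]*?>')

# Hypothetical input lines; only the first two are <div> tags.
lines = [
    '<div class="tab" >',
    '<div id="x" class="panel">',
    '<span class="no">',
]

classes = []
for line in lines:
    m = div_class.search(line)  # same method name as re.search
    if m is not None:
        classes.append(m.group(1))

print(classes)  # ['tab', 'panel']
```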

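Match objects

The object returned by a successful match, whether from one of the common functions or from a regular expression object, is called a match object. Its type is re.Match (displayed as _sre.SRE_Match in Python 3.6 and earlier).

Match.group([group1, ...])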

Match.groups(default=None)
Match.groupdict(default=None)

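import re
a = "Hello, World, root"
# Raw string, so \w is passed to the regex engine unchanged
b = re.search(r"(\w+), (\w+), (?P<name>\w+)", a)
print(b.group(0))     # the entire match
print(b.group(1))     # the first subgroup
print(b.groups())     # all subgroups as a tuple
print(b.groupdict())  # named subgroups as a dictionary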

Notes:

  • group returns one or more subgroups of the match, that is, the parts captured by parentheses (). With no arguments it returns the entire match.
  • groups returns a tuple containing all subgroups of the match.
  • groupdict returns a dictionary containing all named subgroups.
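The three methods can be compared side by side, including the default argument that fills in groups that did not take part in the match. The second pattern, with the optional group named `second`, is a made-up example for that purpose.

```python
import re

a = "Hello, World, root"
b = re.search(r"(\w+), (\w+), (?P<name>\w+)", a)

print(b.group())        # Hello, World, root
print(b.group(1, 3))    # ('Hello', 'root')
print(b.group("name"))  # root
print(b.groups())       # ('Hello', 'World', 'root')
print(b.groupdict())    # {'name': 'root'}

# An optional group that does not participate gets the default value.
c = re.search(r"(\w+)(?:, (?P<second>\w+))?", "Hello")
print(c.groups(default="-"))     # ('Hello', '-')
print(c.groupdict(default="-"))  # {'second': '-'}
```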

Regular expressions have a wide range of applications. I hope this article helps you in learning Python!

