Regex with Parentheses


I am trying to remove the parenthesized parts from the following string:
Snowden (left), whose whereabouts remain unknown, made the extraordinary claim as his father, Lon (right), told US television he intended to travel
I am using the following regex: ([(].*[)]), but it's matching:
(left), whose whereabouts remain unknown, made the extraordinary claim as his father, Lon (right)
Which makes sense, but isn't what I want.
What can I do to solve this? Does it have something to do with greedy or lazy?
EDIT:
I am using Python:
paren = re.findall(ur'([(\u0028][^)\u0029]*[)\u0029])', text, re.UNICODE)
if paren is not None:
    text = re.sub(s, '', text)
This leads to the following output:
Snowden (), whose whereabouts remain unknown, made the extraordinary claim as his father, Lon (), told US television he intended to travel
However, when I print paren.group(0) I get "(left)", meaning the parentheses are included. Why is this?
Thanks.
Use a negated character class: ([(][^)]*[)]). This will match the opening (, then any number of characters which are not a closing ), then the closing ).
You can negate any character or set of characters in this way. To match a literal ^ caret, put it outside the [] character class or anywhere after the first position inside it, like so: [a^bc]. It is always a good idea to read the rules of the regular expression flavor you are working in, so you know exactly what is possible and what the correct syntax is.
Greedy versus lazy matching is one feature that might not be implemented the same way (if at all) in every regular expression engine. It is better to say explicitly what you want to match than to depend on a rule that is sometimes difficult to understand and debug.
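To make the difference concrete, here is a minimal Python sketch (Python because the question's edit mentions it) that runs both patterns on the sentence from the question. The variable names are only illustrative, and the outer capture group is dropped because it is not needed for this comparison.

```python
import re

text = ("Snowden (left), whose whereabouts remain unknown, made the "
        "extraordinary claim as his father, Lon (right), told US television "
        "he intended to travel")

# Greedy: .* runs on to the LAST closing parenthesis in the string,
# so the whole span from "(left" to "right)" is one match.
greedy = re.findall(r"[(].*[)]", text)

# Negated class: [^)]* cannot step over a ")", so each match ends at
# the FIRST closing parenthesis after its opening one.
negated = re.findall(r"[(][^)]*[)]", text)

print(greedy)
print(negated)  # ['(left)', '(right)']

# Removing the matches; note that the space before each "(" is left behind.
cleaned = re.sub(r"[(][^)]*[)]", "", text)
print(cleaned)
```

The negated class states the intent ("anything up to the next closing parenthesis") directly in the pattern, so it does not rely on how a given engine handles greediness.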
Restrict the .* to match only things that aren't parentheses:
([(][^()]*[)])
First, you don't need [] for a single character; just escape the parentheses.
Second, use .*? for a non-greedy match:
/\(.*?\)/
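For what it's worth, the same lazy pattern can be tried in Python roughly like this. This is only a sketch: the sample string is a shortened fragment of the question's sentence, and the leading \s* in the substitution is my own addition to swallow the space left in front of each removed parenthetical.

```python
import re

sample = "Snowden (left) and his father, Lon (right)"

# Lazy quantifier: .*? expands only as far as needed to reach a ")".
lazy = re.findall(r"\(.*?\)", sample)
print(lazy)  # ['(left)', '(right)']

# Remove each parenthetical together with the whitespace before it.
stripped = re.sub(r"\s*\(.*?\)", "", sample)
print(stripped)  # Snowden and his father, Lon
```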
As pguardiario (whom I upvoted) mentioned, you don't need a character class; just escape the parentheses.
His solution will work, with one caveat: if the text within the parentheses is hard-wrapped, the . won't match \n by default. You need a character class for that.
My proposed solution:
\([^)]*\)
This escapes the parentheses on either end, and will always match whatever is between them (unless it contains another parenthetical clause, of course).
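A small Python sketch of both points, the newline caveat and the nested-parentheses limitation. The sample strings are made up for illustration, and re.DOTALL is shown only as an alternative way to let . cross line breaks in Python.

```python
import re

# Hypothetical hard-wrapped text: the parenthetical spans two lines.
wrapped = "Snowden (left,\nstanding) spoke"

# By default "." does not match "\n", so the lazy-dot pattern finds nothing.
lazy_dot = re.findall(r"\(.*?\)", wrapped)

# The negated class matches any character except ")", newlines included.
negated = re.findall(r"\([^)]*\)", wrapped)

# Alternatively, re.DOTALL makes "." match newlines as well.
dotall = re.findall(r"\(.*?\)", wrapped, flags=re.DOTALL)

print(lazy_dot)  # []
print(negated)   # ['(left,\nstanding)']
print(dotall)    # ['(left,\nstanding)']

# The caveat about nesting: the match stops at the first ")".
nested = re.findall(r"\([^)]*\)", "x (a (b) c) y")
print(nested)    # ['(a (b)']
```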
It's a matter of style, but I prefer [(] to \(, so I would use ([(][^)]*[)]).
You haven't specified which language you are using. If it is Perl, I would use the /x modifier, which allows me to add spacing for clarity:
/ ( [(] [^)]* [)] ) /x
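Since the question's edit says Python, the closest equivalent there is the re.VERBOSE flag. A rough sketch follows; the name PAREN and the sample string are just illustrative.

```python
import re

# re.VERBOSE ignores whitespace in the pattern (outside character classes)
# and allows "#" comments, much like Perl's /x.
PAREN = re.compile(r"""
    (          # capture the whole parenthetical
      [(]      # a literal opening parenthesis
      [^)]*    # any run of characters other than a closing parenthesis
      [)]      # a literal closing parenthesis
    )
""", re.VERBOSE)

sample = "Snowden (left) and his father, Lon (right)"
print(PAREN.findall(sample))  # ['(left)', '(right)']
print(PAREN.sub("", sample))
```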
