Overview
A practical and portable regular expression reference for real-world use.
No flag syntax (e.g., (?s), (?m), (?i)) is used, which keeps the patterns compatible across operating systems and tools.
Only safe, environment-independent patterns are included.
Newline Codes and Platform Differences
| OS | Newline Code | Recommended Pattern | Key Point |
| --- | --- | --- | --- |
| Windows | `\r\n` | `\r?\n` | Universal pattern for CRLF/LF |
| macOS / Linux | `\n` | `\r?\n` | Compatible across all platforms |
Key Notes
- Use `\r?\n` to safely detect or replace newlines on Windows/macOS/Linux.
- To match blocks including newlines, use `(?:.|\r|\n)`, since `.` alone excludes line breaks.
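Both notes can be checked with a minimal sketch in Python's `re` module. The sample strings, the `BEGIN`/`END` markers, and the helper names `split_lines` and `first_block` are assumptions made for illustration, not part of the reference itself.

```python
import re

# The same three lines with Windows (CRLF) and Unix (LF) endings.
crlf_text = "alpha\r\nbeta\r\ngamma"
lf_text = "alpha\nbeta\ngamma"

NEWLINE = re.compile(r"\r?\n")

def split_lines(text):
    """Split on CRLF or LF without leaving stray carriage returns."""
    return NEWLINE.split(text)

# `.` stops at \n, so a block that spans lines uses (?:.|\r|\n) instead.
# BEGIN and END are hypothetical block markers.
BLOCK = re.compile(r"BEGIN(?:.|\r|\n)*?END")

def first_block(text):
    """Return the first BEGIN...END block, or None if there is none."""
    m = BLOCK.search(text)
    return m.group(0) if m else None

print(split_lines(crlf_text))  # ['alpha', 'beta', 'gamma']
print(split_lines(lf_text))    # ['alpha', 'beta', 'gamma']
print(repr(first_block("x BEGIN\r\nbody\r\nEND y")))
```

A plain `BEGIN.*?END` finds nothing in the last input, because `.` never crosses the `\n`.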
Part 1: Basic Syntax Overview
1-1. Character Classes and Ranges
| Syntax | Meaning | Example | Match |
| --- | --- | --- | --- |
| `.` | Any single char except newline | `a.c` / abc | abc |
| `[abc]` | One of a, b, or c | bag | b, a |
| `[^0-9]` | Any non-digit | a1 | a |
| `[A-Z0-9]` | Uppercase letter or digit | X8z | X, 8 |
1-2. Quantifiers
| Syntax | Meaning | Example | Match |
| --- | --- | --- | --- |
| `*` | Zero or more | `go*` / goooogle | goooo |
| `+` | One or more | `o+` / google | oo |
| `?` | Zero or one | `colou?r` / color colour | Both |
| `{n}` | Exactly n times | `a{3}` / aaa | aaa |
| `{n,}` | n or more | `a{2,}` / aaaa | aaaa |
| `{n,m}` | Between n and m | `a{2,4}` / aaaaa | aaaa |
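The quantifier rows can be reproduced with a short Python sketch. The patterns `go*`, `o+`, and `colou?r` are assumed readings of the first three rows, which list only the input text, and `all_matches` is a hypothetical helper name.

```python
import re

def all_matches(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)

# (pattern, input) pairs; the first three patterns are assumptions.
examples = [
    (r"go*", "goooogle"),
    (r"o+", "google"),
    (r"colou?r", "color colour"),
    (r"a{3}", "aaa"),
    (r"a{2,}", "aaaa"),
    (r"a{2,4}", "aaaaa"),
]

for pattern, text in examples:
    print(pattern, text, all_matches(pattern, text))
```

Quantifiers are greedy by default, which is why `a{2,4}` takes four characters from `aaaaa` and leaves a single `a` that cannot match on its own.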
1-3. Anchors
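| Syntax | Meaning | Example | Match |
| --- | --- | --- | --- |
| `^abc` | Line start | `abc\nzabc` | abc (line 1) |
| `abc$` | Line end | `zabc\nabc` | abc at the end of both lines (when the tool matches line by line) |
| `\bword\b` | Word boundary | word words | word |
| `\Bing` | Non-boundary | ringing | both ing (neither starts a word) |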
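Whether `^` and `$` apply per line depends on the tool: line-oriented tools and most editors test each line separately, while Python's `re` anchors to the whole string unless a flag is set. The sketch below stays flag-free by splitting on `\r?\n` first. The helper name `lines_matching` is an assumption for illustration.

```python
import re

def lines_matching(pattern, text):
    """Apply pattern to each line separately, as a line-oriented tool would."""
    return [line for line in re.split(r"\r?\n", text) if re.search(pattern, line)]

print(lines_matching(r"^abc", "abc\nzabc"))  # ['abc']
print(lines_matching(r"abc$", "zabc\nabc"))  # ['zabc', 'abc']

# \b sits between a word character and a non-word character (or a string edge).
print(re.findall(r"\bword\b", "word words"))  # ['word']

# \B is any position that is NOT a word boundary; both "ing" in "ringing" qualify.
print([m.start() for m in re.finditer(r"\Bing", "ringing")])  # [1, 4]
```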
1-4. Shortcuts
| Syntax | Meaning | Example | Match |
| --- | --- | --- | --- |
| `\d` | Digit [0-9] | ver2.10 | 2, 1, 0 (`\d+` gives 2, 10) |
| `\D` | Non-digit | a1 | a |
| `\w` | Alphanumeric + _ | a_b-1 | a, _, b, 1 (`\w+` gives a_b, 1) |
| `\W` | Non-word character (not alphanumeric or _) | a# | # |
| `\s` | Whitespace (tab/newline etc.) | a b | space |
| `\S` | Non-whitespace | a b | a, b |
| `\t` | Tab | `a\tb` | [TAB] |
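Each shortcut matches exactly one character, so a quantifier is needed to capture whole runs. The following Python sketch shows the difference on the table's inputs. The helper name `tokens` is hypothetical, and note that for `str` input Python's `\d` and `\w` also accept non-ASCII digits and letters, which goes beyond the `[0-9]` shorthand above.

```python
import re

def tokens(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)

print(tokens(r"\d", "ver2.10"))   # ['2', '1', '0']
print(tokens(r"\d+", "ver2.10"))  # ['2', '10']
print(tokens(r"\w", "a_b-1"))     # ['a', '_', 'b', '1']
print(tokens(r"\w+", "a_b-1"))    # ['a_b', '1']
print(tokens(r"\W", "a#"))        # ['#']
print(tokens(r"\S", "a b"))       # ['a', 'b']
print(tokens(r"\t", "a\tb"))      # ['\t']
```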
1-5. Escapes and Special Characters
| Syntax | Meaning | Example | Match |
| --- | --- | --- | --- |
| `\.` | Literal dot | a.c | `.` |
| `\*` | Asterisk | a*b | `*` |
| `\+` | Plus sign | a+b | `+` |
| `\?` | Question mark | what? | `?` |
| `\(` `\)` | Parentheses | (test) | `(` and `)` |
| `\\|` | Pipe | `a\|b` | `\|` |
| `\\` | Backslash | `C:\path` | `\` |
| `\^` | Caret | ^abc | `^` |
| `\$` | Dollar | total$ | `$` |
| `\[` `\]` | Brackets | [abc] | `[` and `]` |
| `\{` `\}` | Braces | {a,b} | `{` and `}` |
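When the text to search for comes from user input or a file, escaping each metacharacter by hand is error-prone. A minimal Python sketch follows, using the standard `re.escape` function. The helper name `contains_literal` and the sample strings are assumptions for illustration.

```python
import re

def contains_literal(needle, haystack):
    """True if haystack contains needle exactly, with no regex interpretation."""
    return re.search(re.escape(needle), haystack) is not None

# Unescaped, the dot matches any character, so "a.c" also finds "abc".
print(re.search(r"a.c", "abc") is not None)  # True
print(contains_literal("a.c", "abc"))        # False
print(contains_literal("a.c", "a.c"))        # True

# Escaped by hand, as in the table above.
print(re.findall(r"\$", "total$"))           # ['$']
print(re.findall(r"\\", r"C:\path"))         # one backslash is found
print(contains_literal("a|b", "a"))          # False: the pipe is literal here
```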
Part 2: Grouping, Alternation, Lookaround
| Syntax | Purpose | Example | Match |
| --- | --- | --- | --- |
| `(abc)+` | Repeat as group | abcabcx | abcabc |
| `(?:jpg\|png)` | Non-capturing OR | file.png | png |
| `foo\|bar` | OR condition | bar | bar |
| `\d+(?=円)` | Numbers before “円” | 合計100円 | 100 |
| `^(?!.*error).*` | Line not containing “error” | `ok\nerror` | ok |
| `(?<=¥)\d+` | Number after “¥” | ¥300 | 300 |
| `(?<!Mr\.)\s[A-Z]` | Whitespace plus uppercase letter not preceded by “Mr.” | Ms. Alice | " A" (the space and A) |
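The grouping and lookaround rows can be verified with the sketch below, written for Python's `re` module. The helper name `first` and the extra input `Mr. Smith` are assumptions added for contrast. Lookbehind support varies by engine, so these patterns are worth testing in the target tool before relying on them.

```python
import re

def first(pattern, text):
    """Return the first match of pattern in text, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

print(first(r"(abc)+", "abcabcx"))         # abcabc
print(first(r"(?:jpg|png)", "file.png"))   # png
print(first(r"\d+(?=円)", "合計100円"))      # 100 (the 円 is checked, not consumed)
print(first(r"(?<=¥)\d+", "¥300"))         # 300
print(first(r"^(?!.*error).*", "ok"))      # ok
print(first(r"^(?!.*error).*", "an error here"))  # None

# The match includes the whitespace character, not only the capital letter.
print(repr(first(r"(?<!Mr\.)\s[A-Z]", "Ms. Alice")))  # ' A'
print(first(r"(?<!Mr\.)\s[A-Z]", "Mr. Smith"))        # None
```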
Part 3: Matching Multi-line Text
| Use Case | Pattern | Example | Match |
| --- | --- | --- | --- |
| HTML block | `<div>(?:.\|\r\|\n)*?</div>` | | |
| Log entry | `^\[\d{4}-\d{2}-\d{2} [\d:]+\](?:.\|\r\|\n)*?(?=^\[\d{4}-\d{2}-\d{2} …` | | |
| Markdown code block | `` ```(?:.\|\r\|\n)*?``` `` | | |
| Comment (/*…*/) | `/\*(?:.\|\r\|\n)*?\*/` | | |
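A minimal Python sketch of the HTML block and comment patterns follows, with the literal `/*` and `*/` delimiters escaped as `/\*` and `\*/`. The sample inputs and the names `ANY`, `DIV`, and `COMMENT` are assumptions for illustration.

```python
import re

# Flag-free "any character, including line breaks".
ANY = r"(?:.|\r|\n)"

# The lazy *? stops at the FIRST closing delimiter rather than the last.
DIV = re.compile(r"<div>" + ANY + r"*?</div>")
COMMENT = re.compile(r"/\*" + ANY + r"*?\*/")

html = "<div>\r\n  one\r\n</div><div>two</div>"
print(DIV.findall(html))  # two separate blocks, not one merged match

code = "a = 1; /* first\n   second */ b = 2; /* x */"
print(repr(COMMENT.sub("", code)))  # 'a = 1;  b = 2; '
```

With a greedy `*` instead of `*?`, the first `<div>` would run to the last `</div>` and both blocks would merge into one match.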
Part 4: Pattern Library (Filtering / Extraction)
| Purpose | Pattern | Example | Match |
| --- | --- | --- | --- |
| Digits only | `^\d+$` | 123 | 123 |
| Alphanumeric | `^[A-Za-z0-9]+$` | user01 | user01 |
| Email (simple) | `^[\w.-]+@[\w.-]+\.[A-Za-z]{2,}$` | a@b.com | a@b.com |
| URL | `https?://[\w.-]+\.[A-Za-z]{2,}(/[\w./?=&%-]*)?` | https://ex.com/a | https://ex.com/a |
| ISO date | `\d{4}-\d{2}-\d{2}` | 2025-10-30 | 2025-10-30 |
| Intl. phone | `\+\d{1,3}[\s-]?\d{1,14}` | +81 90 1234 5678 | +81 90 (only one separator is allowed, so the match stops at the second space) |
| Strong password | `^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$` | Abcd1234 | full |
| HTML comment | `<!--(?:.\|\r\|\n)*?-->` | | |
| Lines without “error” | `^(?!.*error).*` | `ok\nerror` | ok |
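The library patterns can be collected into a small lookup table for filtering or extraction. This is a sketch under stated assumptions: the dictionary `PATTERNS`, its key names, and the helper `first_match` are hypothetical, and Python's `re` is only one possible engine.

```python
import re

# Patterns copied from the table above; the key names are illustrative.
PATTERNS = {
    "digits": r"^\d+$",
    "alnum": r"^[A-Za-z0-9]+$",
    "email": r"^[\w.-]+@[\w.-]+\.[A-Za-z]{2,}$",
    "url": r"https?://[\w.-]+\.[A-Za-z]{2,}(/[\w./?=&%-]*)?",
    "iso_date": r"\d{4}-\d{2}-\d{2}",
    "intl_phone": r"\+\d{1,3}[\s-]?\d{1,14}",
    "password": r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$",
}

def first_match(name, text):
    """Return the first match of the named pattern in text, or None."""
    m = re.search(PATTERNS[name], text)
    return m.group(0) if m else None

print(first_match("email", "a@b.com"))              # a@b.com
print(first_match("url", "see https://ex.com/a"))   # https://ex.com/a
print(first_match("password", "abcd1234"))          # None: no uppercase letter
print(first_match("intl_phone", "+81 90 1234 5678"))  # +81 90
```

The phone pattern accepts a single optional separator, so a number written with several spaces is matched only up to the second one. Stripping separators before matching is one way around that.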
Part 5: Replacement Examples
| Task | Pattern | Replace | Input | Output |
| --- | --- | --- | --- | --- |
| Reverse words | `(\w+)\s+(\w+)` | `${2} ${1}` | John Doe | Doe John |
| Change delimiter | `;` | `,` | a;b;c | a,b,c |
| Normalize spaces | `\s{2,}` | (single space) | `a  b` | `a b` |
| Remove HTML tags | `<[^>]+>` | (empty) | `<p>a</p>` | a |
| Trim spaces | `^\s+\|\s+$` | (empty) | `" a "` | `"a"` |
| Newline → space | `\r?\n` | (single space) | `a\nb` | a b |
| Remove end comment | `//.*$` | (empty) | x=1;//note | x=1; |
| Unify date format | `(\d{4})/(\d{2})/(\d{2})` | `${1}-${2}-${3}` | 2025/10/30 | 2025-10-30 |
| Compress duplicate lines | `^(.*)(\r?\n\1)+$` | `${1}` | repeated lines | single line |
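Replacement syntax differs between tools: the `${1}` form above is common in editors, while Python's `re.sub` writes the same reference as `\g<1>` (or `\1`). The sketch below applies the table's patterns with that substitution. The function names are assumptions for illustration.

```python
import re

def reverse_words(s):
    return re.sub(r"(\w+)\s+(\w+)", r"\g<2> \g<1>", s)

def normalize_spaces(s):
    return re.sub(r"\s{2,}", " ", s)

def strip_tags(s):
    return re.sub(r"<[^>]+>", "", s)

def trim(s):
    return re.sub(r"^\s+|\s+$", "", s)

def join_lines(s):
    return re.sub(r"\r?\n", " ", s)

def strip_line_comment(s):
    return re.sub(r"//.*$", "", s)

def unify_date(s):
    return re.sub(r"(\d{4})/(\d{2})/(\d{2})", r"\g<1>-\g<2>-\g<3>", s)

def dedupe(s):
    # With no flag set, ^ and $ anchor to the whole string in Python,
    # so this collapses input that consists only of one repeated line.
    return re.sub(r"^(.*)(\r?\n\1)+$", r"\g<1>", s)

print(reverse_words("John Doe"))         # Doe John
print(unify_date("2025/10/30"))          # 2025-10-30
print(strip_line_comment("x=1;//note"))  # x=1;
print(dedupe("x\nx\nx"))                 # x
```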
Summary
Regular expressions are a cross-platform and multilingual tool for text processing, scripting, log parsing, and data cleanup.
Using flag-independent, portable patterns ensures stability across diverse environments and editors.