I want to split a file into chunks with 2 words each.
$ cat tmp
word1 word2 word3 word4 word5 word6 word7
$ sed -e 's/word. word. /&\n/g' tmp
word1 word2
word3 word4
word5 word6
word7
sed -E 's/(word. ){2}/&\n/g' tmp
sed -e 's/\(word. \)\{2\}/&\n/g' tmp
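The sed commands above depend on every word matching `word.` and on `\n` in the replacement being turned into a newline, which is not guaranteed outside GNU sed. As an illustrative alternative (not taken from the answers above), the same chunking can be sketched in awk; the function name `split_pairs` is made up for this example.

```shell
# split_pairs: hypothetical helper that prints two whitespace-separated
# words per output line; the last line of a group may hold a single word.
# Pairing restarts on every input line, because i is counted per line.
split_pairs() {
  awk '{
    for (i = 1; i <= NF; i++)
      printf "%s%s", $i, (i % 2 == 0 || i == NF) ? "\n" : " "
  }'
}

printf 'word1 word2 word3 word4 word5 word6 word7\n' | split_pairs
```

Unlike the first sed command, this leaves no trailing space at the end of each chunk.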
Match the First Occurrence Only with SED
root@drbd-alice:~# cat /etc/chrony/chrony.conf
server 0.debian.pool.ntp.org offline minpoll 8
server 1.debian.pool.ntp.org offline minpoll 8
server 2.debian.pool.ntp.org offline minpoll 8
server 3.debian.pool.ntp.org offline minpoll 8
Test the command first by printing its result, without the -i option. Once you are certain the printed result is ok, you can edit the file in place with the following command.
# sed -i '0,/^server/s/\(^s.*\)/server 172.16.0.3 prefer iburst/' /etc/chrony/chrony.conf
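The "print first, edit later" step can be sketched on a scratch copy rather than on the real configuration. This assumes GNU sed (because of the `0,/re/` address); the temporary file and the variable names `conf` and `preview` are made up for this example.

```shell
# Dry run on a throwaway file: no -i, so nothing is modified on disk.
conf=$(mktemp)
printf '%s\n' \
  'server 0.debian.pool.ntp.org offline minpoll 8' \
  'server 1.debian.pool.ntp.org offline minpoll 8' > "$conf"

# Same script as the -i command above, but the result only goes to stdout.
preview=$(sed '0,/^server/s/\(^s.*\)/server 172.16.0.3 prefer iburst/' "$conf")
printf '%s\n' "$preview"

rm -f "$conf"
```

Only the first server line should come back rewritten; if the preview looks right, the same script can be run with -i against the real file.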
GNU sed only:
Ben Hoffstein's answer shows us that GNU provides an extension to the POSIX specification for sed that allows the following 2-address form: 0,/re/ (re represents an arbitrary regular expression here).
0,/re/ allows the regex to match on the very first line also. In other words: such an address will create a range from the 1st line up to and including the line that matches re - whether re occurs on the 1st line or on any subsequent line.
- Contrast this with the POSIX-compliant form 1,/re/, which creates a range that matches from the 1st line up to and including the line that matches re on subsequent lines; in other words: this will not detect the first occurrence of an re match if it happens to occur on the 1st line, and it also prevents the use of the shorthand // for reuse of the most recently used regex (see next point). [1]
If you combine a 0,/re/ address with an s/.../.../ (substitution) call that uses the same regular expression, your command will effectively only perform the substitution on the first line that matches re.
sed provides a convenient shortcut for reusing the most recently applied regular expression: an empty delimiter pair, //.
$ sed '0,/foo/ s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar # only 1st match of 'foo' replaced
Unrelated
2nd foo
3rd foo
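The example above has its match on the very first line. As a complementary sketch, the same GNU-only command can be fed input whose first match comes later; the wrapper name `replace_first_gnu` is made up for this example.

```shell
# Requires GNU sed: 0,/foo/ ends the range at the first line matching foo,
# and s//bar/ reuses that same regex.
replace_first_gnu() {
  sed '0,/foo/ s//bar/'
}

printf 'Unrelated\n1st foo\n2nd foo\n' | replace_first_gnu
```

Here line 1 is inside the range but has nothing to substitute, line 2 ends the range and is rewritten to "1st bar", and line 3 is left alone.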
A POSIX-features-only sed such as BSD (macOS) sed (will also work with GNU sed):
Since 0,/re/ cannot be used and the form 1,/re/ will not detect re if it happens to occur on the very first line (see above), special handling for the 1st line is required.
MikhailVS's answer mentions the technique, put into a concrete example here:
$ sed -e '1 s/foo/bar/; t' -e '1,// s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar # only 1st match of 'foo' replaced
Unrelated
2nd foo
3rd foo
Note:
- The empty regex // shortcut is employed twice here: once for the endpoint of the range, and once in the s call; in both cases, regex foo is implicitly reused, allowing us not to have to duplicate it, which makes for both shorter and more maintainable code.
- POSIX sed needs actual newlines after certain functions, such as after the name of a label or even its omission, as is the case with t here; strategically splitting the script into multiple -e options is an alternative to using actual newlines: end each -e script chunk where a newline would normally need to go.
1 s/foo/bar/ replaces foo on the 1st line only, if found there. If so, t branches to the end of the script (skips remaining commands on the line). (The t function branches to a label only if the most recent s call performed an actual substitution; in the absence of a label, as is the case here, the end of the script is branched to).
When that happens, range address 1,//, which normally finds the first occurrence starting from line 2, will not match, and the range will not be processed, because the address is evaluated when the current line is already 2.
Conversely, if there's no match on the 1st line, 1,// will be entered, and will find the true first match.
The net effect is the same as with GNU sed's 0,/re/: only the first occurrence is replaced, whether it occurs on the 1st line or any other.
NON-range approaches
potong's answer demonstrates loop techniques that bypass the need for a range; since he uses GNU sed syntax, here are the POSIX-compliant equivalents:
Loop technique 1: On first match, perform the substitution, then enter a loop that simply prints the remaining lines as-is:
$ sed -e '/foo/ {s//bar/; ' -e ':a' -e '$!{n;ba' -e '};}' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar
Unrelated
2nd foo
3rd foo
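For readability, the same loop can also be written with actual newlines in place of the multiple -e options. This is just a re-layout of the command above; the function name `first_only` is made up for this example.

```shell
# On the first line matching foo: substitute, then loop with n, which
# prints the current line and reads the next one, until the last line.
first_only() {
  sed '/foo/ {
    s//bar/
    :a
    $!{
      n
      ba
    }
  }'
}

printf '1st foo\nUnrelated\n2nd foo\n3rd foo\n' | first_only
```

Because the loop never returns to the top of the script, /foo/ is not tested again after the first match, so later occurrences are printed unchanged.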
Loop technique 2, for smallish files only: read the entire input into memory, then perform a single substitution on it.
$ sed -e ':a' -e '$!{N;ba' -e '}; s/foo/bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
1st bar
Unrelated
2nd foo
3rd foo
[1] 1.61803 provides examples of what happens with 1,/re/, with and without a subsequent s//:
- sed '1,/foo/ s/foo/bar/' <<<$'1foo\n2foo' yields $'1bar\n2bar'; i.e., both lines were updated, because line number 1 matches the 1st line, and regex /foo/ - the end of the range - is then only looked for starting on the next line. Therefore, both lines are selected in this case, and the s/foo/bar/ substitution is performed on both of them.
- sed '1,/foo/ s//bar/' <<<$'1foo\n2foo\n3foo' fails with sed: first RE may not be empty (BSD/macOS) and sed: -e expression #1, char 0: no previous regular expression (GNU), because, at the time the 1st line is being processed (due to line number 1 starting the range), no regex has been applied yet, so // doesn't refer to anything.
With the exception of GNU sed's special 0,/re/ syntax, any range that starts with a line number effectively precludes use of //.
You could use awk to do something similar:
awk '/#include/ && !done { print "#include \"newfile.h\""; done=1;}; 1;' file.c
Explanation:
/#include/ && !done
Runs the action statement between {} when the line matches "#include" and we haven't already processed it.
{print "#include \"newfile.h\""; done=1;}
This prints #include "newfile.h"; the inner quotes need to be escaped. Then we set the done variable to 1, so we don't add more includes.
1;
This means "print out the line": a pattern with no action defaults to print $0, which prints out the whole line.
A one-liner and easier to understand than sed IMO :-)
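To see the effect, here is a small sketch that runs the same awk program on sample input from stdin instead of file.c. The sample C lines and the function name `add_include_once` are made up for this example.

```shell
# Prints the new include once, just before the first line containing
# "#include"; the trailing 1 then prints every input line unchanged.
add_include_once() {
  awk '/#include/ && !done { print "#include \"newfile.h\""; done = 1 } 1'
}

printf '%s\n' \
  '#include <stdio.h>' \
  '#include <stdlib.h>' \
  'int main(void) { return 0; }' | add_include_once
```

Note that this inserts a line ahead of the first match rather than replacing it, so all original lines are still present in the output.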