Archive for the ‘linux’ Category
join variable length multiline data entries with sed
% cat test.txt
data1-1
data1-2
data1-3
closing-form
data2-1
data2-2
data2-3
data2-4
closing-form
data3-1
data3-2
closing-form
% cat test.txt | sed -n -e ':x;N;/\nclosing-form$/!bx;s/\n/;/g;p'
data1-1;data1-2;data1-3;closing-form
data2-1;data2-2;data2-3;data2-4;closing-form
data3-1;data3-2;closing-form
%
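The same sed program can be written out with one command per line and a comment on each, which may make the loop easier to follow. This is only a sketch: the sample records are made up, and only the closing-form marker comes from the session above.

```shell
# Made-up sample records, each terminated by a closing-form line.
input=$(printf '%s\n' a1 a2 closing-form b1 b2 b3 closing-form)

# The one-liner, spread over several lines with sed comments.
script='
# label to jump back to
:x
# append the next input line to the pattern space
N
# if the pattern space does not yet end with the closing line, loop
/\nclosing-form$/!bx
# the record is complete: turn the embedded newlines into semicolons
s/\n/;/g
# print the joined record (auto-print is off because of -n)
p
'

joined=$(printf '%s\n' "$input" | sed -n -e "$script")
printf '%s\n' "$joined"
```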
A precondition is that no data entry is empty. If one is, you can introduce helper entries:
% cat testt.txt
data1-1
data1-2
closing-form
closing-form
closing-form
data3-1
data3-2
closing-form
% cat testt.txt | sed '/^closing-form$/ihelper-entry'
data1-1
data1-2
helper-entry
closing-form
helper-entry
closing-form
helper-entry
closing-form
data3-1
data3-2
helper-entry
closing-form
%
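Chaining the two steps shows why the helper entries matter: with them, an empty record still yields a line of its own. A minimal sketch, assuming GNU sed (the one-line form of the i command is a GNU extension) and made-up sample data:

```shell
# Three records; the middle one is empty (two closing lines in a row).
raw=$(printf '%s\n' d1 closing-form closing-form d3 closing-form)

# Step 1: put a helper entry in front of every closing line.
# Step 2: join each record with the loop shown earlier.
joined=$(printf '%s\n' "$raw" |
  sed '/^closing-form$/ihelper-entry' |
  sed -n -e ':x;N;/\nclosing-form$/!bx;s/\n/;/g;p')

printf '%s\n' "$joined"
```

Without step 1 the second closing line would be glued to the following record, because the loop always reads at least one more line before it checks for the marker.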
manipulate k-th column in file with sed & friends
$ echo $0
bash
$
This means that everything below holds at least under Bash. One of its important features is “Process Substitution”. It is not very efficient (extra bash processes get spawned), but I just don’t care at the moment.
The first use case is to manipulate the k-th column of a file. Let’s see our test file:
$ cat -A if1.txt
a^Ib^Ic^Id^Ie^If^Ig^Ih$
b^Ic^Id^Ie^If^Ig^Ih^Ia$
c^Id^Ie^If^Ig^Ih^Ia^Ib$
d^Ie^If^Ig^Ih^Ia^Ib^Ic$
e^If^Ig^Ih^Ia^Ib^Ic^Id$
f^Ig^Ih^Ia^Ib^Ic^Id^Ie$
g^Ih^Ia^Ib^Ic^Id^Ie^If$
h^Ia^Ib^Ic^Id^Ie^If^Ig$
$ cat if1.txt
a b c d e f g h
b c d e f g h a
c d e f g h a b
d e f g h a b c
e f g h a b c d
f g h a b c d e
g h a b c d e f
h a b c d e f g
$
The problem is to search and replace with sed, but only in a given column.
The first solution uses paste and cut.
$ paste <(cut -f-4 if1.txt) <(cut -f5 if1.txt | sed 's/h/x/g') <(cut -f6- if1.txt)
a b c d e f g h
b c d e f g h a
c d e f g h a b
d e f g x a b c
e f g h a b c d
f g h a b c d e
g h a b c d e f
h a b c d e f g
$
The second one uses the bash language and its read command which is able to fill an array with values read from a line.
$ cat if1.txt | while read -a line; do
> line[4]=$(echo ${line[4]} | sed 's/h/x/')
> echo -e ${line[*]} | sed 's/\ /\t/g'
> done
a b c d e f g h
b c d e f g h a
c d e f g h a b
d e f g x a b c
e f g h a b c d
f g h a b c d e
g h a b c d e f
h a b c d e f g
$
Note that this multiline command can be written into a single line as well (obviously by removing the secondary prompt characters).
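The read-based loop can also be sketched without arrays, which keeps it usable in a plain POSIX shell. The variable names (c1 to c5, rest) and the sample rows are assumptions for illustration; setting IFS to a tab makes read split on tabs only, so the remainder of the line keeps its separators.

```shell
tab=$(printf '\t')

result=$(printf 'a\tb\tc\td\th\tf\tg\nd\te\tf\tg\te\ta\tb\n' |
  while IFS="$tab" read -r c1 c2 c3 c4 c5 rest; do
    # Only the fifth column goes through sed.
    c5=$(printf '%s\n' "$c5" | sed 's/h/x/')
    # Put the line back together with tabs; rest still contains its own tabs.
    printf '%s\t%s\t%s\t%s\t%s\t%s\n' "$c1" "$c2" "$c3" "$c4" "$c5" "$rest"
  done)

printf '%s\n' "$result"
```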
Now what if you want to do computations based on other columns? Here are the same two approaches for that (removing the 2nd and 4th columns and introducing a new 3rd one that holds the sum of the two deleted ones):
$ cat -A if2.txt
a^I1^Id^I2$
b^I3^Ie^I4$
c^I5^If^I6$
$ cat if2.txt
a 1 d 2
b 3 e 4
c 5 f 6
$ paste <(cut -f1,3 if2.txt) <(cut -f2,4 if2.txt | sed 's/$/ + p/' | dc)
a d 3
b e 7
c f 11
$ cat if2.txt | while read -a line; do
> echo -e ${line[0]}\\t${line[2]}\\t$(echo ${line[1]} ${line[3]} + p | dc)
> done
a d 3
b e 7
c f 11
$
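If dc is not at hand, the sum can be computed with the shell's own arithmetic expansion. A minimal sketch on the same data as if2.txt; the variable names are made up, and integer values are assumed.

```shell
tab=$(printf '\t')

result=$(printf 'a\t1\td\t2\nb\t3\te\t4\nc\t5\tf\t6\n' |
  while IFS="$tab" read -r name x label y; do
    # Keep columns 1 and 3, append the sum of columns 2 and 4.
    printf '%s\t%s\t%s\n' "$name" "$label" "$((x + y))"
  done)

printf '%s\n' "$result"
```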
multiconnection download with scsh
There’s a radio station’s server I download audio files from. The bandwidth per connection is limited to ~24Kb/sec nowadays (several years ago there wasn’t any limit). Fetching the file over several concurrent connections solves the problem somewhat. Unfortunately the number of connections from a given IP address is also limited, to ~15. Anyway, ~240Kb/sec (when using 10 connections) is much more than 24Kb/sec.
Curl can fetch parts of a file (byte ranges). I decided to use The Scheme Shell (scsh) to implement my idea because of its thread support and its close relationship with command line tools (it is a shell, after all).
The solution is a fast hack. Let’s see..
$ ls
getItFast.scm getItFast.scm~
$ ./getItFast.scm http://someserver/2200.mp3
$ ls
2200.mp3.00 2200.mp3.08 2200.mp3.16 2200.mp3.24 2200.mp3.32 2200.mp3.40
2200.mp3.01 2200.mp3.09 2200.mp3.17 2200.mp3.25 2200.mp3.33 2200.mp3.41
2200.mp3.02 2200.mp3.10 2200.mp3.18 2200.mp3.26 2200.mp3.34 2200.mp3.42
2200.mp3.03 2200.mp3.11 2200.mp3.19 2200.mp3.27 2200.mp3.35 getItFast.scm
2200.mp3.04 2200.mp3.12 2200.mp3.20 2200.mp3.28 2200.mp3.36 getItFast.scm~
2200.mp3.05 2200.mp3.13 2200.mp3.21 2200.mp3.29 2200.mp3.37
2200.mp3.06 2200.mp3.14 2200.mp3.22 2200.mp3.30 2200.mp3.38
2200.mp3.07 2200.mp3.15 2200.mp3.23 2200.mp3.31 2200.mp3.39
$ cat 2200.mp3.* > 2200.mp3
$ rm 2200.mp3.*
$ ls
2200.mp3 getItFast.scm getItFast.scm~
$ cat getItFast.scm
#!/usr/bin/scsh \
-o placeholders -o threads -o locks -s
!#
; this many thread will be started,
; each of'em represents a connection
(define POOL-SIZE 10)
; the length of a chunk in bytes
; (downloaded with one connection)
(define STEP 1000000)
(define URL (argv 1))
(define FNAME (file-name-nondirectory URL))
(define url-content-length
  (lambda (url)
    (string->number
     (cadr ((infix-splitter (rx (+ white)))
            (run/string
             (| (curl -s -S -I ,url)
                (grep "Content-Length"))))))))
(define LENGTH (url-content-length URL))
(define make-queue
  (lambda (data-list)
    (let ((lock (make-lock)))
      (lambda ()
        (let ((re '()))
          (obtain-lock lock)
          (if (null? data-list)
              (set! re '())
              (begin
                (set! re (car data-list))
                (set! data-list (cdr data-list))))
          (release-lock lock)
          re)))))
(define range-string
  (lambda (beg end)
    (let ((begs (number->string beg))
          (ends (number->string end)))
      (string-append begs "-" ends))))
(define get-part
  (lambda (beg end fn)
    (run (curl -o ,fn -s -S -r
               ,(range-string beg end) ,URL))))
; this long is the number field
; in the filenames of parts
(define PADLEN
  (string-length
   (number->string
    (ceiling
     (/ LENGTH STEP)))))
(define file-counter-string
  (lambda (i)
    (let loop ((s (number->string i)))
      (if (<= PADLEN (string-length s))
          s
          (loop (string-append "0" s))))))
(define counted-file-name
  (lambda (i)
    (string-append FNAME
                   "."
                   (file-counter-string i))))
; this contains the works to do
; (work ~ download a specific chunk)
; e.g. ((0 999999 "foo.mp3.00") (1000000 1999999 "foo.mp3.01") ... )
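(define QUEUE
  (make-queue
   (let loop ((work-list '()) (low 0) (upp (- STEP 1)) (counter 0))
     ; byte offsets run from 0 to LENGTH-1, so stop as soon as low
     ; reaches LENGTH (otherwise a file whose size is an exact multiple
     ; of STEP gets one extra, unsatisfiable range request)
     (if (>= low LENGTH)
         work-list
         (loop (cons (list low upp (counted-file-name counter)) work-list)
               (+ upp 1)
               (min (- LENGTH 1) (+ upp STEP))
               (+ counter 1))))))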
(define signal-thread-finish
  (lambda (waiter)
    (placeholder-set! waiter #f)))
(define start-worker
  (lambda ()
    (let ((waiter (make-placeholder)))
      (spawn
       (lambda ()
         (let loop ()
           (let ((work (QUEUE)))
             (if (null? work)
                 (signal-thread-finish waiter)
                 (begin
                   (apply get-part work)
                   (loop)))))))
      waiter)))
(let loop ((i POOL-SIZE) (waiters '()))
  (if (= i 0)
      (map placeholder-value waiters)
      (loop (- i 1) (cons (start-worker) waiters))))
$
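The chunking step of the script can be sketched in plain shell to show which byte ranges end up being requested. This is a simplified illustration, not a port of the scsh code: chunk_ranges is a hypothetical helper, the two-digit padding is fixed instead of computed, and the last byte offset is taken to be length minus one, since HTTP byte ranges are zero-based and inclusive.

```shell
# chunk_ranges LENGTH STEP NAME
# Print one "first-last partname" line per chunk of STEP bytes.
chunk_ranges() {
  length=$1
  step=$2
  name=$3
  low=0
  i=0
  while [ "$low" -lt "$length" ]; do
    upp=$((low + step - 1))
    # The final chunk is usually shorter: clamp it to the last byte.
    if [ "$upp" -ge "$length" ]; then
      upp=$((length - 1))
    fi
    printf '%d-%d %s.%02d\n' "$low" "$upp" "$name" "$i"
    low=$((upp + 1))
    i=$((i + 1))
  done
}

ranges=$(chunk_ranges 2500000 1000000 foo.mp3)
printf '%s\n' "$ranges"
```

Each printed range is what the script hands to curl's -r option, and each part name is what it passes to -o.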
join & filter multiline data records with sed
Grouping with sed:
$ cat nrs.txt
01
02
03
04
05
06
07
08
09
10
11
12
$ cat nrs.txt | sed -n -e 'N;s/\n/ /g;p'
01 02
03 04
05 06
07 08
09 10
11 12
$ cat nrs.txt | sed -n -e 'N;N;s/\n/ /g;p'
01 02 03
04 05 06
07 08 09
10 11 12
$ cat nrs.txt | sed -n -e 'N;N;N;s/\n/ /g;p'
01 02 03 04
05 06 07 08
09 10 11 12
$ cat nrs.txt | sed -n -e 'N;N;N;N;N;s/\n/ /g;p'
01 02 03 04 05 06
07 08 09 10 11 12
$
Good for the following task:
$ cat entries.txt
entry-1-data-1
entry-1-data-2
entry-2-data-1
entry-2-data-2
$ cat entries.txt | sed -n -e 'N;s/\n/ /;p'
entry-1-data-1 entry-1-data-2
entry-2-data-1 entry-2-data-2
$
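Since the only thing that changes with the group length is the number of N commands, the sed program can be generated. A minimal sketch; join_every is a hypothetical helper name.

```shell
# join_every K: read lines on stdin and join every K of them with spaces.
join_every() {
  k=$1
  script=''
  i=1
  # A group of K lines needs K-1 N commands.
  while [ "$i" -lt "$k" ]; do
    script="${script}N;"
    i=$((i + 1))
  done
  # Because of -n, a trailing incomplete group is silently dropped.
  sed -n -e "${script}s/\\n/ /g;p"
}

grouped=$(printf '%s\n' 01 02 03 04 05 06 | join_every 3)
printf '%s\n' "$grouped"
```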
Groups of length two, keeping only one element of each (first the 1st, then the 2nd):
$ cat nrs.txt | sed -n -e 'p;n'
01
03
05
07
09
11
$ cat nrs.txt | sed -n -e 'n;p'
02
04
06
08
10
12
$
In the general case you have groups of length k, and the starting pattern is ‘n;n;…;n’ with k-1 letters n in it. A p can be placed before, between or after the n-s, which gives exactly k possible positions. If there’s a p at position k0, then the k0-th element of every group is printed.
So if you have groups of length 6 and you want the fifth element of each:
$ cat nrs.txt | sed -n -e 'n;n;n;n;p;n'
05
11
$
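The rule for placing the p can be turned into a small generator for the sed program. This is a sketch with a hypothetical helper name, pick_nth; it assumes 1 <= k0 <= k.

```shell
# pick_nth K K0: print the K0-th line of every group of K lines on stdin.
pick_nth() {
  k=$1
  k0=$2
  script=''
  i=1
  while [ "$i" -le "$k" ]; do
    # The p goes in front of the n that would skip past element K0.
    if [ "$i" -eq "$k0" ]; then
      script="${script}p;"
    fi
    # K-1 n commands step through the rest of the group.
    if [ "$i" -lt "$k" ]; then
      script="${script}n;"
    fi
    i=$((i + 1))
  done
  sed -n -e "$script"
}

picked=$(printf '%s\n' 01 02 03 04 05 06 07 08 09 10 11 12 | pick_nth 6 5)
printf '%s\n' "$picked"
```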
cumulating minutes begun
$ cat seconds.txt
120
123
$ cat seconds.txt | sed 's/$/ 60 ~ 0 !=r +/' | sed '1i[1+] sr 0' | sed '$ap' | dc
5
$
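The generated dc program divides each value by 60 with ~, adds one to the quotient when the remainder is not zero, and accumulates the results. In other words, it sums the durations rounded up to whole minutes. The same computation can be sketched with shell arithmetic; started_minutes is a hypothetical helper name, and non-negative integer input is assumed.

```shell
# started_minutes: read one duration in seconds per line,
# print the total number of minutes that were started.
started_minutes() {
  total=0
  while read -r secs; do
    # (secs + 59) / 60 is integer division rounded up.
    total=$((total + (secs + 59) / 60))
  done
  printf '%d\n' "$total"
}

sum=$(printf '%s\n' 120 123 | started_minutes)
printf '%s\n' "$sum"
```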
connecting grepping into sed
I mostly use grep and sed in the following pattern:
cat file | grep some-pattern | sed s/other-pattern/replacement/
But what if some-pattern and other-pattern are the same and, moreover, you want to refer to groups in the replacement? Here’s what sed offers for this:
cat file | sed -e 's/pattern/replacement/p; d'
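A small worked example may help here. The p flag prints a line only when the substitution succeeded, and d then discards the pattern space, so lines that do not match never reach the output. The sample lines and the pattern are made up; the second command shows the equivalent form that uses -n instead of d.

```shell
lines=$(printf '%s\n' 'user=alice id=7' 'noise' 'user=bob id=12')

# Filter and rewrite in one step, referring to both groups.
with_d=$(printf '%s\n' "$lines" |
  sed -e 's/^user=\([a-z]*\) id=\([0-9]*\)$/\2 \1/p; d')

# Same result: suppress auto-print with -n and drop the d.
with_n=$(printf '%s\n' "$lines" |
  sed -n -e 's/^user=\([a-z]*\) id=\([0-9]*\)$/\2 \1/p')

printf '%s\n' "$with_d"
```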
removing Hungarian accents with sed on XP
I’m about to create a backup of the family photo collection. To avoid further issues with character encoding I decided to remove accents from the characters in file names. This is the sed file I wrote:
s/\o341/a/g
s/\o355/i/g
s/\o373/u/g
s/\o365/o/g
s/\o374/u/g
s/\o366/o/g
s/\o372/u/g
s/\o363/o/g
s/\o351/e/g
s/\o301/A/g
s/\o315/I/g
s/\o333/U/g
s/\o325/O/g
s/\o334/U/g
s/\o326/O/g
s/\o332/U/g
s/\o323/O/g
s/\o311/E/g
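To see a few of these rules at work on a single name, here is a minimal sketch. It assumes GNU sed (the \oNNN escape is a GNU extension) and a byte-oriented locale, and the file name is made up: its accented letters are written as the single bytes 0351, 0341 and 0355.

```shell
# A made-up name containing the bytes for e-acute, a-acute and i-acute.
name=$(printf 'k\351p \341rv\355z.jpg')

# LC_ALL=C makes sed treat the input as plain bytes.
clean=$(printf '%s\n' "$name" |
  LC_ALL=C sed -e 's/\o351/e/g' -e 's/\o341/a/g' -e 's/\o355/i/g')

printf '%s\n' "$clean"
```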
file renamer improvement
I decided to create a context menu entry for files that removes spaces and other unwanted characters from their names. I use Nautilus, where this can be done with the nautilus-actions package. We need a script that runs when the menu item is selected. If multiple files are selected, the script receives them as a space-separated list of parameters. My script looks like this:
#!/usr/bin/zsh
pattern="[^a-zA-Z0-9-.]"
for (( i=1 ; i<=$# ; i+=1 ))
do
  source=$*[$i]
  target=${source:h}/${${source:t}//${~pattern}/_}
  if [[ ! -a $target ]] then
    mv "$source" "$target"
  fi
done
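For shells without zsh's :h and :t modifiers, the same idea can be sketched in POSIX sh with dirname, basename and sed. This is an assumption-laden rewrite rather than the script above: rename_clean is a hypothetical name, and the demo file is created in a temporary directory.

```shell
# rename_clean FILE...: replace every character outside a-z, A-Z, 0-9,
# dot and dash in the file name (not in the directory part) with _.
rename_clean() {
  for src in "$@"; do
    dir=$(dirname "$src")
    base=$(basename "$src")
    target=$dir/$(printf '%s\n' "$base" | sed 's/[^a-zA-Z0-9.-]/_/g')
    # Never overwrite: skip when the target exists (this also covers
    # names that are already clean, where target equals the source).
    if [ ! -e "$target" ]; then
      mv "$src" "$target"
    fi
  done
}

workdir=$(mktemp -d)
: > "$workdir/01 Track (live).mp3"
rename_clean "$workdir/01 Track (live).mp3"
renamed=$(ls "$workdir")
rm -rf "$workdir"
printf '%s\n' "$renamed"
```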
file renamer
Let’s suppose you have a bunch of files whose names contain various characters you want to get rid of. I mean you want to eliminate those characters, not the files.
afroid-laptop% paste <(ls -1 | sed -e 's/^\(.*\)$/"\1"/') <(ls -1 | sed -e 's/\ /_/g') | sed -e 's/^/mv /'
mv "01 Track 01 13.mp3" 01_Track_01_13.mp3
mv "02 Track 02 22.mp3" 02_Track_02_22.mp3
mv "03 Track 03 32.mp3" 03_Track_03_32.mp3
mv "04 Track 04 42.mp3" 04_Track_04_42.mp3
mv "05 Track 05 52.mp3" 05_Track_05_52.mp3
mv "06 Track 06 62.mp3" 06_Track_06_62.mp3
mv "07 Track 07 72.mp3" 07_Track_07_72.mp3
mv "08 Track 08 82.mp3" 08_Track_08_82.mp3
mv "09 Track 09 91.mp3" 09_Track_09_91.mp3
afroid-laptop% paste <(ls -1 | sed -e 's/^\(.*\)$/"\1"/') <(ls -1 | sed -e 's/\ /_/g') | sed -e 's/^/mv /' | sh
afroid-laptop% ls
01_Track_01_13.mp3 04_Track_04_42.mp3 07_Track_07_72.mp3
02_Track_02_22.mp3 05_Track_05_52.mp3 08_Track_08_82.mp3
03_Track_03_32.mp3 06_Track_06_62.mp3 09_Track_09_91.mp3
afroid-laptop%
useful key shortcuts on ubuntu
I’m evil sometimes ; )
As a first scenario, I wanted to save some pages of a book provided by a flash site. This site hides the pages once they have been seen, because of copyright issues. I had to create a bunch of screenshots : ) I also wanted to minimize the clicks and key strokes needed for a screenshot.
On ubuntu, start up ‘gconf-editor’ from a terminal. Go to path ‘apps/metacity/keybinding_commands/command_1’. Place ‘/somepath/sshot.sh’ there. The content of that script is like this:
afroid-laptop% cat sshot.sh
import -window root -quality 90 /pathtoscreenshotscollection/`date +%Y%m%d%H%M%S`.png
afroid-laptop%
Later on, under ‘global_keybindings/command_1’, insert the key binding string, e.g. ‘<Alt>T’.
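The script can be split into two small functions so that the file name logic is easy to check without taking a screenshot. This is only a sketch: shot_path and take_shot are hypothetical names, /tmp/shots is a placeholder directory, and import is the ImageMagick tool used above.

```shell
#!/bin/sh
# shot_path DIR: print a timestamped PNG path inside DIR.
shot_path() {
  printf '%s/%s.png\n' "$1" "$(date +%Y%m%d%H%M%S)"
}

# take_shot DIR: capture the whole screen into DIR.
# Defined for illustration only; it is not called in this sketch.
take_shot() {
  import -window root -quality 90 "$(shot_path "$1")"
}

shot_file=$(shot_path /tmp/shots)
printf '%s\n' "$shot_file"
```

Since the timestamp has a resolution of one second, two screenshots taken within the same second would overwrite each other.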
Obviously you can change names and bindings as it’s appropriate for you.
The next one is to get descriptions for words from a dictionary.
The command to place is: ‘firefox "http://pewebdic2.cw.idm.fr/popup/popupmode.html?search_str="`xsel`’.
Of course you have to install the xsel package before using it. After that you only select a word with your mouse and press, for example, F8, and a page appears with the entry for that word from the dictionary…