Topic: Another Stupid Gawk / Grep / Sed Question

Ben

  • Administrator
  • Senior Member
  • *****
  • Posts: 46,095
  • I'm an Extremist!
Another Stupid Gawk / Grep / Sed Question
« on: February 08, 2007, 06:39:37 PM »
I have another stupid question (well, several of them) for you text-manipulating experts. I have a bazillion-line file I'm working on that has some funky stuff I want to remove, and then I want to add an end-of-line character to each line.

I want to do the following in either Gawk or Grep:

1) Remove everything to the left of a ":" on each line (this exists on every line).
2) Remove everything to the right of a "<" on each line (this exists variably on some lines).

edit: I also need to remove the ":" and "<" in each instance.

Then I think (?) I want to use Sed to add a ";" to the end of each line as a delimiter for import into a spreadsheet. If it's easier in Gawk or Grep I'm down.

I hate being a programming leech, but, well, here I am.

Thanks in advance for any assistance!
"I'm a foolish old man that has been drawn into a wild goose chase by a harpy in trousers and a nincompoop."

Phantom Warrior

  • friend
  • Senior Member
  • ***
  • Posts: 926
Re: Another Stupid Gawk / Grep / Sed Question
« Reply #1 on: February 08, 2007, 10:33:24 PM »
I'm not a huge programming guru, despite a CS degree, but this cries out to me, "Perl script!" I'm not sure exactly what environment you are working in, and I've never used Gawk/Grep/Sed, but a short Perl script would probably solve your problems, abstractly speaking...

roo_ster

  • Kakistocracy--It's What's For Dinner.
  • friend
  • Senior Member
  • ***
  • Posts: 21,225
  • Hoist the black flag, and begin slitting throats
Re: Another Stupid Gawk / Grep / Sed Question
« Reply #2 on: February 09, 2007, 07:47:49 AM »
Perl will do the trick, but the development time of awk/sed is much less.

Also, awk & sed scripts are easily converted into Perl at a later time using a2p & s2p.  This is nice if processing speed or I/O becomes an issue.

My usual is:
1. Get tired of manual edits
2. Develop tcsh/awk/sed script that does what I want
3. If tcsh/awk/sed takes too long or there is just too much to process:
   a. Convert awk & sed to Perl & hack them together using the bit of Perl I know
   b. Replace the awk & sed calls in my tcsh script with the perl script

awk/gawk can do it all, BenW.  I would recommend the O'Reilly sed & awk book, found on B&N discount shelves in earlier printings for a song.  You don't have it?  Run, do not walk, to get it.

Here is a csh script that calls an awk script and a sed script.

Between the two (reference & example) I think you can manage it.

Quote from: zzDET.csh
Code:
#!/bin/csh

set utl=$0:h
set temp1=temp1.$$
set temp2=temp2.$$

foreach dirc ($argv[*])

  set tail = $dirc:t
  cp $dirc/DETECTION $temp1
  awk -f $utl/zget.level.awk $temp1 > $temp2
  sed -f $utl/zrep.level.sed $temp2 > "det_""$tail"".txt"
end
rm $temp1
rm $temp2


grep -v "^L" det* > DET_Summary.txt




Quote from: zget.level.awk
Code:
#
BEGIN {prntflag = "NO" }
#
# set print flag to YES when "RED TARGET" is hit
($4 ~ /RED/) && ($5 ~ /TARGET/)  { prntflag = "YES" }
#
# set print flag to NO when "BLUE TARGET" is hit
($4 ~ /BLUE/) && ($5 ~ /TARGET/)  { prntflag = "NO" }
#
#
# print all appropriate lines
{if ( prntflag == "YES" ) { print($0) } }
This translates as, "Print every line between the YES & NO conditions."
Here, you might want to set your prntflag to YES when you hit the colon & set it to NO when you hit the less-than sign.
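
In Ben's file the colon and the less-than sign sit on the same line, so a print flag that flips from one line to the next may not be enough by itself. A minimal sketch of one alternative, trimming inside each line with awk's sub(), might look like this. The function name trim_fields and the sample input are made up for illustration, and it assumes everything up to the first ":" on a line should go.

```shell
# Hypothetical helper: trim each line read from stdin, then append ";".
trim_fields() {
  awk '{
    sub(/^[^:]*:/, "")   # drop everything up to and including the first ":"
    sub(/<.*$/, "")      # drop "<" and everything after it, if a "<" is present
    print $0 ";"         # add the ";" delimiter at the end of the line
  }'
}

# Made-up sample lines; the real file is not shown in the thread.
printf '%s\n' 'name: alpha <junk>' 'id: beta' | trim_fields
# prints:
#  alpha ;
#  beta;
```

Any whitespace next to the ":" or "<" is kept, so a spreadsheet import may still need to trim it.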



Quote from: zrep.level.sed
Code:
#
# replace every "/" with three spaces
s/\//   /g
# rewrite the two column-header lines, then replace every "TOTAL " with "ALL SENSORS"
s/TYPE    TYPE         TOTAL                                RED TARGET/DETECTOR SENSOR TOTAL_NUM/g
s/TYPE    TYPE         AVERAGE                              RED TARGET/DETECTOR SENSOR AVG_RNG/g
s/TOTAL /ALL SENSORS/g
#
# delete all blank lines
/^$/d
# delete lines with "TITLE", "SUMMARY" & "FIRER" (the "TARGET" and formfeed rules are commented out)
/TITLE/d
/SUMMARY/d
# /TARGET/d
/FIRER/d
# /^L/d

Appending something to the end of each line is easy, but I cannot recall the syntax.  You may want to do it in the awk script.
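
For reference, a sketch of two common ways to do the append, one in sed and one in awk. The function names and the sample input are hypothetical; the ";" is the delimiter Ben asked for.

```shell
# sed: "$" matches the end of the line, so this substitutes ";" in at that point.
append_semicolon_sed() {
  sed 's/$/;/'
}

# awk: print the whole line ($0) followed by ";".
append_semicolon_awk() {
  awk '{ print $0 ";" }'
}

# Made-up sample input.
printf '%s\n' 'alpha' 'beta' | append_semicolon_sed
# prints:
# alpha;
# beta;
```

The awk form fits naturally as the last rule of an existing awk script, which saves a separate pass over the file.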

Good luck.
Regards,

roo_ster

“Fallacies do not cease to be fallacies because they become fashions.”
----G.K. Chesterton

Ben

Re: Another Stupid Gawk / Grep / Sed Question
« Reply #3 on: February 09, 2007, 07:53:44 AM »
Thanks jfruser! I'm gonna give this a shot today.

Telperion

  • friend
  • Member
  • ***
  • Posts: 140
Re: Another Stupid Gawk / Grep / Sed Question
« Reply #4 on: February 09, 2007, 02:59:46 PM »
sed 's/^.*://' < in | sed 's/<.*$/;/' > out

Oops, that won't work if the < is missing from a line; better to do:

sed 's/^.*://' < in | sed 's/<.*$//' | sed 's/.*$/&;/' > out
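
As a side note, the same three substitutions can probably be folded into a single sed call with repeated -e options, which avoids the extra pipes. This is only a sketch: the function name clean_lines and the sample input are invented for illustration. The expressions are the ones above, except that the last one uses the shorter s/$/;/ to append the ";".

```shell
# Hypothetical wrapper around one sed invocation; reads stdin, writes stdout.
clean_lines() {
  # 1) delete through the last ":" on the line (".*" is greedy)
  # 2) delete "<" and everything after it, when a "<" is present
  # 3) append ";" to every line
  sed -e 's/^.*://' -e 's/<.*$//' -e 's/$/;/'
}

# Made-up sample input; for a real file: clean_lines < in > out
printf '%s\n' 'name: alpha <junk>' 'id: beta' | clean_lines
# prints:
#  alpha ;
#  beta;
```

Because ^.*: is greedy, a line with more than one colon is cut at the last one. If only the text before the first colon should go, s/^[^:]*:// is the usual alternative.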

Ben

Re: Another Stupid Gawk / Grep / Sed Question
« Reply #5 on: February 13, 2007, 06:20:54 AM »
Thanks again for the help, guys. I ended up using Telperion's sed script, which worked nicely.

I also ordered the AWK / Sed book, so I don't have to leech help (as much) anymore.