Scratchpad

Iteration in bash

4 Jan. 2008

I'm a terrible programmer, mainly because I only try to do it when I really need something, instead of just sequestering myself in a room with cheesy puffs and Dr. Pepper for 2 weeks to just do it already. So I find myself having to keep notes when I do anything successfully lest I forget when I need to program again 6 months later. Even the really, really simple shit. Sigh.

Changing file names quickly in a Linux directory with a billion files:


n=0
for i in s_*.gif; do
  n=$(( n + 1 ))
  mv -- "$i" "file$n.gif"
done
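
A slightly more defensive sketch of the same idea, for the day a file name has a space in it or a target already exists. The function name rename_seq, the prefix argument and the four-digit zero padding are my own choices for illustration, not part of the original one-liner.

```shell
# rename_seq PREFIX
# Renames every s_*.gif in the current directory to PREFIX0001.gif,
# PREFIX0002.gif, ... in glob (sorted) order.
# Hypothetical helper: the name and the zero padding are assumptions.
rename_seq() {
  prefix=$1
  n=0
  for f in s_*.gif; do
    [ -e "$f" ] || continue            # the glob matched nothing
    n=$((n + 1))
    target=$(printf '%s%04d.gif' "$prefix" "$n")
    if [ -e "$target" ]; then          # never overwrite an existing file
      echo "skipping $f: $target already exists" >&2
      continue
    fi
    mv -- "$f" "$target"
  done
}
```

Run it as `rename_seq file` from inside the directory. Letting the shell expand the glob in the for loop, instead of parsing ls output, keeps names with spaces intact.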

Combatting spam caused by mailing lists

12 Sep. 2007

I finally got off my dead butt and decided to play with procmail for my spam. Is it just me, or has spam really increased lately? The SANS Internet Storm Center leads me to believe it actually has. Stupid NFL spam. Stupid stocks.

Anywho...so, procmail. Today I discovered a really, really, really awesome spam fighting combination, which is probably super obvious, but having just implemented it I am only now basking in its glory. I've been using the first 3 steps below for a long time now, but for step 4 I was using really elaborate Pine filters. The Pine filters were pretty weak, but replacing them with a single procmail recipe really made this killer.

  • (1) Create a personal e-mail account and share it only with people you trust implicitly. Do not share it with people who you believe are liable to get viruses on their computers (e.g. your parents and friends who like to open Britney Spears attachments). Do not ever, ever, ever use it to sign up for things on the internet. Never enter it into any form whatsoever. Do not use it on a website. Protect it with your life.
  • (2) Create a second e-mail account and do all the things I cautioned against in #1, but use this new, "trash" account to do them.
  • (3) Create a third account and use it only to sign up for mailing and discussion lists, if you are into that sort of thing.
  • (4) Create the following procmail recipe:

    :0:
    * ^To:.*e-mail\.address@rule3above\.com
    spam

So, in plain English: signing up for e-mail lists pretty much guarantees that you will receive spam. But e-mail lists send their messages "To" the list, not "To" your e-mail address. That means that if the address you use for mailing lists appears in the "To" field of an e-mail, the message is almost certainly spam. The recipe above shunts all e-mails in this category to a folder named "spam," where you can quickly skim through to make sure none of the messages were really meant for you. It also means that virtually no spam sent to your mailing-list address will remain in your inbox. Spam created by moronic friends who open e-mail attachments would require a different rule.

I love this recipe.
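
To sanity-check the logic on a saved message before trusting procmail with real mail, the same test can be sketched in plain shell. This is only an illustration, not a replacement for the recipe: the function name addressed_directly is made up, the address is the placeholder from the recipe, and folded (multi-line) To: headers are not handled.

```shell
# Placeholder address from the recipe; substitute your own list-only address.
LIST_ADDR='e-mail.address@rule3above.com'

# addressed_directly ADDRESS < message
# Exits 0 if the To: header of the message on stdin contains ADDRESS.
# Assumption: the To: header fits on a single line.
addressed_directly() {
  sed '/^$/q' |          # keep only the header block (stop at the first blank line)
    grep -i '^To:' |     # pick out the To: header
    grep -qiF -- "$1"    # fixed-string, case-insensitive match on the address
}

# Usage (message.eml is a hypothetical saved message):
#   if addressed_directly "$LIST_ADDR" < message.eml; then echo "probably spam"; fi
```

Real list traffic carries the list's address in To:, so the function fails for it and succeeds only for mail sent straight to the list-only address.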

Slightly improved KTRU Top 35 list

4 Aug. 2007

I realized my last list cut out a huge swath of much older Top 35 lists, so I've modified my script slightly. It's still ugly and definitely not perfect, but, really, how much time do I have to play with perfecting this (don't ask)?


wget -w3 -r -l1 -IOLDLIST --no-parent http://bang.rice.edu/top35archive.shtml


rm -f test.txt; for i in * ; do sed -n '/op 35:/,$p' "$i" >> test.txt ; done


sed 's/<br>/%/g' test.txt | tr '%' '\n' | tr "[:upper:]" "[:lower:]" | egrep -v "top 35" | sed -e 's/- /:: /g; s/ \/ / :: /g; s/   / :: /g' | egrep '::' | sed -e 's/^[0-9+] :: //g; s/^.[0-9] :: //g; s/^[0-9+]. //g; s/^.[0-9]. //g; s/^ //g' | sed -e :a -e '$!N;s/\n[^a-z0-9]/ /;ta' -e 'P;D' | tr -s " " | sed -e 's/<[a-z0-9 /"=]*>//g; s/\r//g; s/: ::/ ::/g;' | sort | uniq > top35.txt

Apologies if it comes out like crap or doesn't work for anyone. The special characters may or may not appear properly in the browser. If you run it and it doesn't work, give me a yell. Or, just look at the finished list.
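
For anyone who can't face reading that pipeline, here is a stripped-down sketch of its core steps: split on <br>, lowercase, turn the separator into " :: ", drop the rank number, and dedupe. It assumes entries look like "1. Artist - Album", which is a guess at the page layout for illustration only, and the function name split_and_dedupe is made up. It skips most of the cleanup the real script does.

```shell
# split_and_dedupe < raw.html
# Assumption: each entry looks like "RANK. artist - album" and entries
# are separated by <br>. Prints unique "artist :: album" lines.
split_and_dedupe() {
  awk '{ gsub(/<br>/, "\n"); print }' |      # one entry per line
    tr '[:upper:]' '[:lower:]' |             # compare case-insensitively
    sed -e 's/ - / :: /' \
        -e 's/^[0-9][0-9]*[.]* *//' |        # unify separator, drop the rank
    grep ' :: ' |                            # keep only artist :: album lines
    sort -u                                  # same job as sort | uniq
}
```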

Search result sifting using compression algorithm

10 Jul. 2007

Clever way to sift standard search results (e.g. Google) against a known-good result, thereby increasing the relevance of the final results.

Wonder if it would be possible to build a front end to perform the actions in one step: type in the search, paste the known-good result, hit go. It would call the script and do everything in one swoop.

On Digital History Hacks, of course.
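
A rough sketch of what the "one swoop" script might look like. I'm assuming the comparison is a normalized compression distance (NCD) computed with gzip, which is one common way to compare texts by compression and not necessarily what the linked post does. The function names and the one-file-per-result layout are made up for illustration.

```shell
# csize FILE...: bytes in the gzip-compressed concatenation of the FILEs.
csize() {
  cat -- "$@" | gzip -c | wc -c
}

# ncd A B: normalized compression distance between files A and B.
# Near 0 means "very alike", near 1 means "little in common".
ncd() {
  awk -v a="$(csize "$1")" -v b="$(csize "$2")" -v ab="$(csize "$1" "$2")" 'BEGIN {
    a += 0; b += 0; ab += 0                  # force numeric values
    min = (a < b) ? a : b
    max = (a > b) ? a : b
    printf "%.3f\n", (ab - min) / max
  }'
}

# rank_results KNOWN_GOOD RESULT...: list result files, most similar first.
rank_results() {
  good=$1; shift
  for f in "$@"; do
    printf '%s\t%s\n' "$(ncd "$good" "$f")" "$f"
  done | sort -n
}
```

With each search hit saved to its own file, `rank_results known-good.txt hits/*` would print the hits with the closest matches at the top. gzip carries a fixed overhead, so very short texts will give noisy scores.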

KTRU Complete top 35 lists

7 Jul. 2007

How to compile a single file of KTRU top 35 playlists for all time, in less than thirty minutes:


wget -O ktruraw.txt -w5 -r -l1 -IOLDLIST --no-parent http://bang.rice.edu/top35archive.shtml


sed '/<.*>/d' ktruraw.txt | egrep "(/|-| )" | sed 's/ - / :: /g' | sed 's/ \/ / :: /g' | sed 's/ / :: /g' | sed 's/^[^0-9]/1 :: &/g' | cut -d ":" -f3- | tr -s " " | tr "[:upper:]" "[:lower:]" | egrep "^ [a-z]" | sort | uniq > ktru-top35

Not perfect - some repeats where albums were entered with spelling or typing variations.
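
One cheap way to knock out some of those repeats would be to compare entries with punctuation and spacing ignored. A sketch, assuming the file holds one "artist :: album" entry per line. The function name dedupe_loose is made up, and this only catches punctuation and spacing variations, not actual misspellings.

```shell
# dedupe_loose < ktru-top35
# Keeps the first spelling seen of each entry. Two entries count as the
# same if they match once everything except a-z and 0-9 is ignored.
dedupe_loose() {
  tr '[:upper:]' '[:lower:]' |
    sed -e 's/[[:space:]][[:space:]]*/ /g' -e 's/^ //' -e 's/ $//' |
    awk '{
      key = $0
      gsub(/[^a-z0-9]/, "", key)             # comparison key: letters and digits only
      if (!(key in seen)) { seen[key] = 1; print }
    }'
}
```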