Entries tagged with “unix” from Garbage In, Garbage Out

scandisk.pl

This script reads a file (or a raw character device, such as /dev/rdisk0 on a Mac), looking for read errors. It works in large chunks to reduce system call overhead, but drops down to 512-byte blocks to report accurately how far into the file or disk the read errors occur. It also reports slow reads (longer than 3 seconds).
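The chunk-then-narrow approach can be sketched roughly as follows. This is a minimal Python illustration, not the Perl script itself: the 1 MiB chunk size, the function name `scan`, and the injectable `read` parameter are assumptions made for the example; only the 512-byte block size and the 3-second threshold come from the description above.

```python
import os
import time

SECTOR = 512            # smallest unit used to pinpoint an error
CHUNK = 1024 * 1024     # large read size (assumed; the real script may differ)
SLOW_SECONDS = 3.0      # reads slower than this are reported


def scan(path, chunk=CHUNK, sector=SECTOR, slow=SLOW_SECONDS, read=os.pread):
    """Return (bad_offsets, slow_offsets) found while reading `path`."""
    bad, slow_reads = [], []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            started = time.monotonic()
            try:
                data = read(fd, chunk, offset)
            except OSError:
                # Re-read the failing chunk in 512-byte steps to locate the error.
                for pos in range(offset, offset + chunk, sector):
                    try:
                        read(fd, sector, pos)
                    except OSError:
                        bad.append(pos)
                offset += chunk
                continue
            if time.monotonic() - started > slow:
                slow_reads.append(offset)
            if not data:
                break           # end of file or device
            offset += len(data)
    finally:
        os.close(fd)
    return bad, slow_reads
```

The large reads keep the common case cheap; the per-block fallback only runs for a chunk that has already failed, so the extra system calls are paid only where the offsets matter.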

http://gigo.com/ftp/pub/src/scandisk.pl

Cache CGI for offloading content to cheap host

The good news is that I painted my motorcycle.

The bad news is that I posted a URL to the pictures, and other folks are inlining that photo on various forums, which risks pushing my colo bandwidth bill into overage.

The other good news is that I've written a cache to offload these requests to. It runs on a cheap shared hosting environment, where they bill by total bandwidth per month instead of by semi-peak usage.

index.cgi

Edit the file to suit your needs. Be sure to edit the part where you indicate which paths you are willing to cache and serve; this was meant to serve only the data I wanted to serve, not to be the latest Slashdot abuse host.

An example of the script in action:

http://cache.gigo.com/index.cgi/gallery.gigo.com/jfesler/st1300/.lowres/aag.jpg

index.cgi is the script itself. The path before it is where I placed it on a cheap host. The path after it is the web URL I want mirrored.
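The mapping from the trailing path to the mirrored URL, together with the "only serve what I want to serve" check, might look something like this. It is a hedged Python sketch rather than the actual index.cgi: the `ALLOWED_PREFIXES` value, the function name `upstream_url`, and the assumption that the source is always fetched over plain http are all illustrative.

```python
import posixpath

# Hypothetical allowlist: host/path prefixes this cache is willing to serve.
ALLOWED_PREFIXES = ("gallery.gigo.com/jfesler/st1300/",)


def upstream_url(path_info, allowed=ALLOWED_PREFIXES):
    """Map CGI PATH_INFO to the URL to mirror, or None if not permitted."""
    target = path_info.lstrip("/")
    # Reject traversal tricks such as "a/../b" before checking the prefix.
    if posixpath.normpath(target) != target.rstrip("/"):
        return None
    if not any(target.startswith(prefix) for prefix in allowed):
        return None
    return "http://" + target  # assumes the source is served over http
```

With the example request above, PATH_INFO would be `/gallery.gigo.com/jfesler/st1300/.lowres/aag.jpg`, which passes the prefix check; anything outside the listed prefixes is refused instead of fetched.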

You can set the parameters on how often it should refresh a mirror. If there were no changes (and the source is not dynamic), then the mirror should be cheap to refresh.

There is no cache cleanup script (yet). I'll post it here when I get around to writing it. :-)

udpcast

I need to replicate specific blobs of data across basically the entire organization, and often. In the past we've used some fairly smart, yet still unicast, methods of distributing the data. However, the sheer number of machines is making our current algorithm feel the pain. The obvious answer is to multicast, or at least broadcast, the data.

It seems that multicast tools for file replication are still in their infancy. After digging around, I did find a tool that came close to what I needed: a tool called "udpcast". First, the strengths: (1) it seems to work (!); (2) it is simple to implement, and folks know I happen to like simple; (3) the license is friendly for my work environment.

It has FEC (forward error correction) built in, so we can simply spew the data onto the broadcast Ethernet network, and anyone who's listening can take it. If they mangle or drop a packet, they are still likely to get the entire package, thanks to the built-in redundancy. And, for our purposes, if we don't get it, it is not the end of the world. We'll get it next time.

On the downside, it seems to have no security paranoia. It has no resource limits for maximum size of the output file, and handles timing out poorly. It also moves just one blob of data, so you're likely to pass a tarball or something around. Given that there is no authentication and any host on the net can blast packets, one has to add in data validation.
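One way to bolt on the missing size limit and validation is to wrap whatever the receiver writes out. The Python sketch below is an illustration only and does not use any udpcast API: `accept_blob` and `MAX_BYTES` are hypothetical names, and it assumes the expected SHA-256 digest reaches each host over a separate, trusted channel (a digest shipped alongside the broadcast would prove nothing).

```python
import hashlib
import hmac

MAX_BYTES = 64 * 1024 * 1024   # hypothetical cap on an accepted blob


def accept_blob(stream, expected_sha256, max_bytes=MAX_BYTES):
    """Read `stream` fully; return the bytes only if size and digest check out."""
    digest = hashlib.sha256()
    chunks, total = [], 0
    while True:
        chunk = stream.read(65536)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Stop early instead of filling the disk with a hostile stream.
            raise ValueError("blob exceeds size limit")
        digest.update(chunk)
        chunks.append(chunk)
    if not hmac.compare_digest(digest.hexdigest(), expected_sha256):
        raise ValueError("digest mismatch")
    return b"".join(chunks)
```

Only after this returns would the tarball be unpacked; a rejected blob is simply dropped, which fits the "we'll get it next time" model.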

Overall, though, this tool does seem to show promise...

Newer greylisting daemon for Postfix

http://gigo.com/ftp/pub/src/postgreysql.pl

A greylisting daemon for Postfix, frankensteined from various bits of code so that it requires only *one* running instance (not several). It uses DBD::SQLite, a free, local, disk-based SQL engine (CPAN will install all you need; no daemons needed). This is what I'm currently using, as it is lighter weight on the server.
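For readers who have not seen greylisting on top of SQLite, the core bookkeeping can be sketched as below. This is generic greylisting logic in Python, not a transcription of postgreysql.pl: the table layout, the 300-second delay, and the function names are assumptions, and the Postfix policy protocol plumbing is left out entirely.

```python
import sqlite3
import time

DELAY = 300  # seconds a new triplet must wait (assumed value)


def open_db(path=":memory:"):
    """Open the local SQLite file and make sure the triplet table exists."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS triplets ("
        " client TEXT, sender TEXT, recipient TEXT, first_seen REAL,"
        " PRIMARY KEY (client, sender, recipient))"
    )
    return db


def check(db, client, sender, recipient, now=None, delay=DELAY):
    """Return 'dunno' to let mail through or 'defer_if_permit' to greylist."""
    now = time.time() if now is None else now
    key = (client, sender.lower(), recipient.lower())
    row = db.execute(
        "SELECT first_seen FROM triplets"
        " WHERE client = ? AND sender = ? AND recipient = ?", key
    ).fetchone()
    if row is None:
        # First sighting: remember it and ask the sending server to retry.
        db.execute("INSERT INTO triplets VALUES (?, ?, ?, ?)", key + (now,))
        db.commit()
        return "defer_if_permit"
    return "dunno" if now - row[0] >= delay else "defer_if_permit"
```

Because the state lives in one local file, a single process can answer every lookup, with no separate database server to run.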

By the way, SQLite rocks. It is worth looking at. You may not have a use for it today, but you may in the future. It is free, bundleable, requires no server, is now included by default in PHP, and is easy to install for Perl (just install DBD::SQLite, done!).

xtail.pl - tail multiple files at once

http://gigo.com/ftp/pub/src/xtail.pl

This rocks. That's all I can say.

It keeps following a log even after log rotation has moved the file out from under you. It can also watch an entire directory for any changes, new additions, etc.
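Following a file across rotation usually comes down to noticing that the name now points at a different inode. The Python sketch below shows that idea for a single file; it is an assumption about the general technique, not a description of how xtail.pl is written, and the `Follower` class is a hypothetical name. Unlike a real tail, it starts reading from the beginning of the file.

```python
import os


class Follower:
    """Follow one file by name, reopening it if it is rotated or truncated."""

    def __init__(self, path):
        self.path = path
        self.handle = None
        self.inode = None

    def _open(self):
        self.handle = open(self.path, "rb")
        self.inode = os.fstat(self.handle.fileno()).st_ino

    def poll(self):
        """Return any bytes appended since the last call."""
        data = b""
        try:
            current = os.stat(self.path)
        except FileNotFoundError:
            current = None
        if self.handle is not None:
            data = self.handle.read()          # drain the old file first
            rotated = current is None or current.st_ino != self.inode
            truncated = current is not None and current.st_size < self.handle.tell()
            if rotated or truncated:
                self.handle.close()
                self.handle = None
        if self.handle is None and current is not None:
            self._open()                       # pick up the new file by name
            data += self.handle.read()
        return data
```

Calling `poll()` in a loop with a short sleep gives tail-like behaviour; watching a whole directory would mean keeping one such follower per file and rescanning the directory listing for new names.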

Whitelisting via your imap folder

http://gigo.com/ftp/pub/src/find-email-in-sentmail

Scans your IMAP folders and returns an exit code based on whether the sender address is found. It is used for whitelisting rules in procmail. I use this to automatically whitelist anyone in my "sent-mail" folder, or anyone in my "whitelist" folder. This scans my IMAP folders at delivery time, so changes are 100% realtime and automatic. I suggest this only for efficient IMAP servers, like Cyrus.
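The lookup itself is simple once the messages are in hand. The Python sketch below covers only the "sent-mail" case and skips the IMAP fetch entirely: it takes raw messages as bytes, and the function names and the convention that exit code 0 means "found" are assumptions, not details taken from the script.

```python
import email
from email.utils import getaddresses


def recipients(raw_messages):
    """Collect lower-cased To/Cc addresses from raw RFC 822 messages."""
    found = set()
    for raw in raw_messages:
        msg = email.message_from_bytes(raw)
        headers = msg.get_all("To", []) + msg.get_all("Cc", [])
        found.update(addr.lower() for _, addr in getaddresses(headers) if addr)
    return found


def exit_code(sender, raw_messages):
    """Return 0 if `sender` was ever written to, else 1 (assumed convention)."""
    return 0 if sender.lower() in recipients(raw_messages) else 1
```

A wrapper would pass the result to `sys.exit()` so that a mail filter can branch on it.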