leo charre





text to html

March 9th, 2010

I’ve been trying out text2html. It’s pretty cool.
You point it to a text file (or stdin), and it spits out html.
There are some fucked up things about it though..

For one, by default, it does nothing.
Nah, I’m not kidding. It does nothing- spits out same shit that came in. Can you imagine if you ate an apple and shat an apple out, unchanged? That would not be cool. Ok ok.. so it does html entities.. fuck me..
Second- there’s no help option.
Every unix command must have a -h or --help option. Because it’s expected. The empirical *I* fucking expect it.
Instead you have to use $ man text2html. Oh- but.. wait.. what’s this? Not the complete manual? You have to read $ man HTML::FromText for the full options.
Shit.
This is enough to piss me the fuck off.

It does cool shit, but you have to wine and dine it before it’ll suck your dick. It won’t just take your fifty bucks to do it. And you know the unix way.. By default, this program should suck your dick, no surprises, and shut up.
It should eat my apple and shit out a shit.

Workaround..

Create an alias in your ~/.bashrc file. Add this line:

alias text2html='text2html --blockcode --bold --bullets --email --numbers --paras --tables --underline --urls'

Note that it won’t work until you start another shell session (or run source ~/.bashrc in the current one).

Now all options are on, and the thing behaves closer to what is expected.
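
One catch with the alias: bash does not expand aliases inside scripts by default, so it only helps at the prompt. A shell function does the same job in both places. This is a minimal sketch, assuming the same flags as the alias above; the name text2html_all is made up for this example.

```shell
# Hypothetical wrapper name; the flags are the same ones used in the alias.
text2html_all() {
    # "command" runs the real text2html program, bypassing any alias or function
    command text2html --blockcode --bold --bullets --email --numbers \
        --paras --tables --underline --urls "$@"
}

# Usage (assumes text2html is installed):
#   text2html_all notes.txt > notes.html
```

Any extra arguments (file names, more flags) are passed straight through by "$@".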

To install this fine mess of a program… Well.. the application is awesome- the api is a fucking cracked out microsoft whore.
As root..

# cpan HTML::FromText
# man text2html

Here’s some example input and output…

Original Text:

I AM TEXT THAT WILL BE TRANSFORMED TO HTML

Let's see and try what happens here.. shall we.

And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will.


And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. This is to test that the next word
will be wrapped up, and there won't be a break between the 'word' and 'will'.


   - thank you
   - i am also thanked of course
   - you think?

ALSO WHAT ABOUT

   A definition.
      I am goint to think so very much

   And what about this on?
      I will also think that. Thank you.



Great. What about a link? http://leocharre.com

Done.
Html output..

I AM TEXT THAT WILL BE TRANSFORMED TO HTML

Let's see and try what happens here.. shall we.

And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will.

And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. And hopefully what we expect to happen will. This is to test that the next word will be wrapped up, and there won't be a break between the 'word' and 'will'.

  • thank you
  • i am also thanked of course
  • you think?

ALSO WHAT ABOUT

A definition.
   I am goint to think so very much

And what about this on?
   I will also think that. Thank you.

Great. What about a link? http://leocharre.com

Done.


You may want to look at the source of this page for the little details. It’s pretty clean, very nice job. (No, it’s not being transformed by freaking wordpress.. See wordpress raw html plugin )

help with installing tesseract ocr

February 4th, 2010

# INSTALL.tesseract
# =================
#
# Installing tesseract can be tricky.
#
#
# 1) Some dependencies..
#
#
#
# You may need gcc-c++, automake (gnu automake), and svn (subversion).
# You can check if you have these using the ‘which’ command..
# which svn
# which automake
#
# If the command is not present, which prints no path (some versions print an error instead).
#
# If you have ‘yum’ (fedora/redhat) or ‘apt-get’ (debian/ubuntu), you may want
# to simply try:
#
# apt-get install automake
# apt-get install subversion
#
# yum -y install subversion
# yum -y install automake
#
# If this does not work, you need to download the source packages and manually
# install them.
#
# You can get gnu automake from:
# http://www.gnu.org/software/automake/
#
# And subversion from:
# http://subversion.apache.org/
#
# As for gcc-c++, it is likely already installed on your system.
# If you’re missing gcc-c++, try using yum or apt-get.
# Here is where to read more about gcc
# http://gcc.gnu.org/
#
# 2) Get the source for tesseract..
#
# You may be able to simply install the SVN version of Tesseract by
# using these commands..

svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
cd tesseract-ocr
./runautoconf
mkdir build-directory
cd build-directory
../configure
make
make install

#
# for more info, see google project on ocr, they use tesseract
#
# you can also try to run these commands as a script ( lines starting
# with a pound sign are comments and are ignored by bash/sh )
# save this text as something like ‘INSTALL.tesseract’ and then run..
# sh ./INSTALL.tesseract
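
If you'd rather check all the dependencies from step 1 in one go, something like the following should do. It is only a sketch: the function name check_tools is made up, and it assumes the compiler from the gcc-c++ package shows up on your PATH as g++.

```shell
#!/bin/sh
# Report which of the named build tools are missing from PATH.
# Prints one "missing: NAME" line per absent tool, and nothing if all are found.
check_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to look up a command; it fails if the tool is absent
        command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
    done
}

# g++ is the compiler binary that the gcc-c++ package installs
check_tools g++ automake svn
```

No output means everything is there and you can move on to step 2.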

perl command line usage examples aka one liners

January 26th, 2010

Search and replace..

You have a directory with many html files. In them you have occurrences of the email address jim@hardwire.com, and you need to change it to james@gmail.com..

First, for curiosity’s sake.. What files have this text ‘jim@hardwire.com’ ?

find /home/myself/www/public_html -type f | xargs grep 'jim@hardwire.com'

This will output what files have that text, and where.1

I wanted to replace usage of one module in a test script with another.
The old module is DMS::AP::Base, I want to change the text in the tests to DMS::Base::AP

perl -p -i -e 's/DMS::AP::Base/DMS::Base::AP/g' ./t/*.t

There are many more detailed sources of information on perl one liners.2


1 This is text that explains etc etc. Furthermore etc etc. Furthermore etc etc. Furthermore etc etc. Furthermore etc etc.

2 more on perl one liners

browsing your partition with tree via the command line

January 26th, 2010

These tools are useful for development.

Most of the time you work on the terminal, and to find your way around a project you use things like find, ls, and tab completion.
If you need more of a bird’s eye view, you may fire up a gui browser like konqueror. But that’s a gui, and guis are for users.

Another option is tree. Here is example output of tree:

[leo@localhost devel]$ tree
.
`-- WordPress
    |-- bin
    |   `-- wppost
    |-- lib
    |   `-- WordPress
    |       |-- Base.pm
    |       `-- Post.pm
    |-- t
    |-- wp-content
    |   `-- plugins
    |       |-- akismet
    |       |   |-- akismet.gif
    |       |   `-- akismet.php
    |       |-- hello.php
    |       |-- pictpress.php
    |       |-- pm_admin_menu.php
    |       |-- postmaster
    |       |   `-- readme.txt
    |       |-- postmaster.php
    |       `-- wp-db-backup.php
    |-- wp-mail.php
    `-- xmlrpc.php

9 directories, 13 files


editing images in the command line with convert and mogrify

January 26th, 2010

One of the dumbest things I used to do in making web pages was to resize images and make thumbnails in ‘photoshop’.

The next less dumb thing I did was to script thumbnailing. To allow a server to make the thumbnails instantly.
Then I got comfortable with things like convert and mogrify.

Both of these are interfaces to image magick..

graph of hard drive usage with filelight

January 26th, 2010

a visual graph of hard drive usage

Filelight is a graphical representation of the storage in your computer. A visual graph of a partition’s space usage.
In slightly more technical terms, it’s a way to see how much of a partition is taken up by porn, movies, music, and all the other junk that wastes your miserable existence.

how to get a movie screenshot with bash and mplayer

January 25th, 2010

Imagine you have a movie, and you want to get a screen capture (screenshot, screencap, what have you..).
What should you do? Open the movie, pause and then get a screenshot of your desktop?
Nah…

Use mplayer..

Let’s create a directory to store the screenshot..
$ mkdir /tmp/moviecap

Now let’s tell mplayer to take a screenshot of the movie at 15 minutes and 20 seconds..
$ mplayer -vo jpeg:outdir=/tmp/moviecap/ -nosound -ss 15:20 -frames 1 source_movie.avi
$ ls /tmp/moviecap
Voila. You get a 00000001.jpg file, it’s a movie frame.

Ok. Easy enough.
What if we want to convert the entire movie to frames? So we can select some of the frames we want..
(This will take a while..)
$ mkdir /tmp/allframes
$ mplayer -vo jpeg:outdir=/tmp/allframes -nosound source_movie.avi
$ ls /tmp/allframes
Yeah, you’ll see a few thousand files.
How do we get rid of some of these?
$ for num in 1 3 5 7 9; do rm /tmp/allframes/*$num.jpg; done;
That deleted all images ending in odd numbers.

Great. But we still have all these freaking images named very vaguely- Let’s rename these suckers..
$ cd /tmp/allframes
$ rename 0 source_movie_ ./0*.jpg
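
Note that rename comes in more than one flavor (the one above takes "from to files"; the perl one on debian wants an expression instead). If yours acts up, a plain loop with mv works anywhere. A sketch, run against a scratch directory with fake frames standing in for /tmp/allframes; unlike the rename line above, it keeps the full frame number and just puts the prefix in front.

```shell
# Scratch directory with a few fake frames, standing in for /tmp/allframes.
frames=$(mktemp -d)
touch "$frames/00000002.jpg" "$frames/00000004.jpg" "$frames/00000006.jpg"

# Prefix every frame with the movie name; plain mv works on any system.
prefix=source_movie_
for f in "$frames"/0*.jpg; do
    [ -e "$f" ] || continue          # the glob matched nothing
    mv -- "$f" "$frames/$prefix$(basename "$f")"
done

ls "$frames"
```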

Now you have to view and pick out..
$ gqview /tmp/allframes

Ok. That should make it easier for ya :-)

running remote ssh commands

January 14th, 2010

Holy fuck.. Here’s something cool.. remote commands with ssh.

I like to log into a machine and watch apache error log as I run a stoopid cgi.
I can do this locally..

$ ssh username@hostname 'tail -f /path/to/logfile'

This starts splurting output just as it would, locally.

vim unix perl development tips

November 24th, 2009

All my work worth using twice is stored as distros inside our version control repository.
We use cvs. It works really well.

Some of the stuff is stored as reference in case something blows up and we need to know what was there. Such as with crontab files, sometimes they have important details- and if we had to recreate from scratch, we could forget one of those important details.
Config files are good too, httpd conf files.. The occasional log file, although rare.

Some projects can get big. So you separate them into smaller projects/module distros with test suite, documentation- and make the hell sure they work really well by themselves, stand alone.

And then some stuff- some of it.. Is really better off with a ton of code. It’s rare that you really have an advantage with this- but it can happen.

In that case, maintenance is real work. But I have learned a few things over the years that have sped up and improved my ability to keep up mountains of code and hundreds of distros.
I will list some of these things here. This article will be in constant update and revision.


Look through a ton of code in your distro and open to the exact place where you find "it" via the command line.

I need to find where Cwd::Ext is in my code to change it.

1) [root@wingnut DMS2]# find ./ | xargs grep -s 'Cwd::Ext'
./lib/DMS/User.pm:use Cwd::Ext;
./lib/DMS/User.pm:   require Cwd::Ext; # read in everything
./lib/DMS/User.pm:   my $d = Cwd::Ext::abs_path_nd($abs) or warn("cant get abs path to [$abs]") and return [];
2) [root@wingnut DMS2]# vim ./lib/DMS/User.pm +/Cwd::Ext

Command 1) searches the current directory for all files, all directories.. everything.. and looks inside for the text 'Cwd::Ext'. The -s is error suppression for grep. For example when you run grep on files with spaces this way- you'll get errors. I don't care about that - I just want to look inside files that I know have no funny chars in the filename. This command is actually a pipeline of three commands. The output of find (find ./) is piped (|) to xargs, which runs grep.
If you didn't use xargs, grep would be filtering 'Cwd::Ext' from the file listing itself, the text stream that find prints. When you pass a file argument instead of a text stream to grep, it looks inside the file. xargs passes every list element as an argument to the next command. On unix, non escaped whitespace is a delimiter: spaces, newlines, etc.

Command 2) Is much more interesting. It means to open the file in vim- and the + flag means to run the following command, just as you would in command (:) mode inside vim. What we are saying in this command, is to open the file and search for the string. When the file opens, the cursor sits on the first line where the text 'Cwd::Ext' appears.
Very fucking cool. Very fucking convenient.
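
If you do have filenames with spaces, find and xargs can delimit names with a NUL character instead of whitespace, and grep stops complaining. A sketch against a made-up scratch tree (the files are invented for the demo):

```shell
# Scratch tree standing in for a distro; "My Module.pm" has a space on purpose.
distro=$(mktemp -d)
mkdir -p "$distro/lib/DMS"
printf 'package DMS::User;\nuse Cwd::Ext;\n1;\n' > "$distro/lib/DMS/User.pm"
printf 'use Cwd::Ext;\n' > "$distro/lib/My Module.pm"
printf 'nothing here\n'  > "$distro/lib/Other.pm"

# -type f skips directories; -print0 and -0 delimit names with NUL instead of whitespace.
# -l prints only the names of matching files.
matches=$(find "$distro" -type f -print0 | xargs -0 grep -ls 'Cwd::Ext' | sort)
printf '%s\n' "$matches"

# grep -n reports the line number, which vim also accepts as +N:
#   vim +2 lib/DMS/User.pm
grep -n 'Cwd::Ext' "$distro/lib/DMS/User.pm"
```

The last command prints 2:use Cwd::Ext; so vim +2 lands on the same line that +/Cwd::Ext would find.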


Setting colors of directory listings in bash terminal

This one I underestimated the importance of. But no more.

If you really use vim/perl/unix as I do- you are in bed with the command line. You're looking through a lot of file listings, regularly.

Using most guis, such as kde/gnome on ubuntu/fedora/suse/debian.. You'll notice when you do an ls on a directory- you most likely, by default, get some funny ephemera manifestations. What I mean is, directories may be listed in a different color. Maybe they are bold. And regular files are normal text. And some of them are in different colors!

What's up with that? Do we really give a shit if every jpg is red and every gif is blue, pdfs are purple and then notice also- JPG is not detected as a file ext to highlight and jpg *is*. Try it, change the ext of a JPG file to jpg, the color changes.

What is up with that? Well- I had been pleasantly ignoring that since- forever. I never cared to meddle or alter the settings, or even to look if there was a setting to affect that.

But recently I've been really curious to learn not just more code- more programming- but more of *how to work* more efficiently. I looked up how to affect these color and text formatted listings of directories- hoping that it may be helpful when looking through distros, through filesystems.. And you know what.. it is. Do you remember using vim and then one day actually taking the time to search and replace, properly, with capture.. And.. Wasn't it really fucking useful?

Ok, this one I know you're gonna roll your eyes at. And I did too, for years. But I would like you to take ten minutes of your life to give this a ride and see for yourself that it does help out a little. And the life of the code... These little things add up- using find xargs grep, using cvs, perl one liners- and I'm gonna have to vouch for this one too. Directory listing text formatting.

  1. Ok, the thing you are looking to alter is a shell environment variable- on bash it's LS_COLORS.
  2. What is it set at currently? Run set and grep out for LS_COLORS:
    $ set | grep LS_COLORS
    LS_COLORS='no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;35:*.cmd=00;32:*.exe=00;32:*.sh=00;32:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.t=93:*.pm=00;36:*.pod=00;96:*.conf=00;33:*.off=00;9:*.jpg=00;94:*.png=00;94:*.xcf=00;94:*.JPG=00;94:*.gif=00;94:*.pdf=00;91'
    
  3. Alright, wtf just happened. This string tells ls that when ls --color is called, these are the colors/text formatting to use.

    Now, ls --color? When have you run that flag? This is set as an alias in some bashrc file somewhere. Likely not your ~/.bashrc. But in some /etc/$BASHRC?? file. If you want to find out in which exact place you're defaulting to use ls --color ...

    $ find /etc/ 2>/dev/null | xargs grep -s 'alias ls'
    /etc/profile.d/colorls.sh:      alias ls='ls --color=tty' 2>/dev/null
    /etc/profile.d/colorls.csh:alias ls 'ls --color=tty'
    
    Now, how the heck is this being called??? Look at your ~/.bashrc, it may have an entry to include /etc/bashrc if it exists, and then.. well.. there's some other default profile voodoo and such I'm not versed in.
  4. How it works..

    The string value of the LS_COLORS environment variable is parsed as a hash, internally. The stream of garble is understood as.. The delimiter is the colon (:) symbol. The first part is what the element of the directory listing is, then assignment (=), and a style code. More than one style code can be assigned.

    For the following chunk of string: di=00;1:fi=00:*.php=00;34, it means:

    di(directory) =(assignment) 00(normal text) ;(and) 1(bold)
    :(delimiter, next entry..)
    fi(file) =(assignment) 00(normal text)
    :(delimiter, next entry..)
    *.php(everything matching *.php) =(assignment) 00(normal text) ;(and) 34(blue)
    
  5. How to change it!

  6. All you need to do to see it change before your eyes, is to go to a command line prompt, and type in:

    $ LS_COLORS='di=00:fi=00;1:*.php=00;34'
    $ ls
    

    You'll notice that makes all regular files appear bold, all directories in 'normal text weight' and default color (black on white, white on black)- and php files are blue (or whatever color is mapped to that slot).

    Now, if you wanted to make that change permanent (instead of only to that one terminal session you just opened)- you would enter that into your ~/.bashrc file .. as:

    export LS_COLORS='di=00:fi=00;1:*.php=00;34'
    
  7. Knowing the codes:

    Please do note.. that's a pretty lousy string to use. It's just an example. You need to play around with it and figure out what you really want. For that, you need to know what the codes are.. for the filesystem elements (no, ln, di, fi.. etc), for setting up filename patterns (*.pm)- and what the codes mean (1 bold, 00 normal, 9 strikethrough). I picked up a good list from Bartman.

    Here's that same list with some extras added- these work properly on GNU bash 3.0.

    0 = default colour
    1 = bold
    4 = underlined
    5 = flashing text
    6 = no change
    7  = reverse field
    8 = concealed (hidden text, may show up as black)
    9 = strikethrough (cool!)
    10 - 29= no change
    30  = black
    31  = red
    32  = green
    33  = orange
    34  = blue
    35  = purple
    36  = cyan
    37  = grey
    38 = underline
    39 = no change
    40  = black background
    41  = red background
    42  = green background
    43  = orange background
    44  = blue background
    45  = purple background
    46  = cyan background
    47  = grey background
    90  = dark grey
    91  = light red
    92  = light green
    93  = yellow
    94  = light blue
    95  = light purple
    96  = turquoise
    100 = dark grey background
    101 = light red background
    102 = light green background
    103 = yellow background
    104 = light blue background
    105 = light purple background
    106 = turquoise background
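
To make the format from step 4 concrete, here is a small sketch that takes one of these strings apart and prints a sample of a style. The function names describe_ls_colors and swatch are made up for this example; the string is the sample one from above, and you can pass "$LS_COLORS" instead to inspect your own.

```shell
# Example string from the text above; swap in "$LS_COLORS" to inspect your own.
colors='di=00;1:fi=00:*.php=00;34'

# Split on ":" into entries, then on the first "=" into pattern and style codes.
describe_ls_colors() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS='=' read -r pattern codes; do
        [ -n "$pattern" ] || continue
        printf '%s -> %s\n' "$pattern" "$codes"
    done
}

describe_ls_colors "$colors"

# Print a sample of a style: ESC[<codes>m turns it on, ESC[0m resets it.
swatch() {
    printf '\033[%sm%s\033[0m\n' "$1" "sample text in style $1"
}
swatch '00;34'
```

Running swatch on a few codes from the list is a quick way to see what your own terminal does with them, since the exact colors depend on the terminal.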
    

My personal LS_COLORS environment variable for perl development- I have refined mine to be:

LS_COLORS='no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;35:*.cmd=00;32:*.exe=00;32:*.sh=00;32:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.t=93:*.pm=00;36:*.pod=00;96:*.conf=00;33:*.off=00;9:*.jpg=00;94:*.png=00;94:*.xcf=00;94:*.JPG=00;94:*.gif=00;94:*.pdf=00;91'

This helps me detect some things that are important to me, such as tests.t to stand out from other junk in t/, a different color for pod and pm files.. etc. very cool.

I've been using this across my servers. And it does help. It's a very small thing to do, to improve your development environment. I suggest you try it out.

move bugzilla to another server

November 12th, 2009

I wanted to move my bugzilla from server wingnut to server thumbscrew.

  1. First I logged into thumbscrew, and I used scp to copy everything..

    scp -r root@wingnut:/var/www/public_html/bugzilla /var/www/public_html/bugzilla
  2. Now I have to get the database..
    I log into wingnut and run mysqldump, -p will prompt for password.
    (My database is casually named.. bugzilla.)

    mysqldump -p -r /tmp/bugzilla.sql bugzilla
  3. Now I go back to thumbscrew, retrieve the database backup, and feed it to mysql locally (the bugzilla database has to exist on thumbscrew already).
    scp root@wingnut:/tmp/bugzilla.sql ./
    mysql -p bugzilla < bugzilla.sql
    

    Voila. Done. Unix runs you. Windows is for users.

