8 Feb 2012

Review: Mining the Social Web

I wish I had started with this book when I first got interested in machine learning: it's fun and gives just enough background to try things out without information overload. There are always references for exploring a topic further if you wish - and I did so with text information retrieval, which is a good sign of how well the book introduces the topics at hand. The best chapters are the ones on analysing email (ch 3), blogs (ch 8, Natural language processing and beyond) and Twitter (ch 5). Chapter 7 on Google Buzz is unfortunately outdated - it would be great if a new chapter on Google Plus were released for new buyers.

Each chapter is light on detailed theoretical explanations, but the code samples are just fun to run. And they're written in Python, which I have a soft spot for :) The author's style is very easy to read, which makes the ride that much better.

Mining the Social Web is a great introduction to the variety of ways data so freely available on the web can be analysed and visualized - it's sort of mind-opening what can already be achieved with so little effort. Good book for beginners.

 

3 Feb 2012

Reset database before phpunit tests in Symfony 2

Here's a way you can reset the database before running your phpunit tests.

In app/phpunit.xml, change the bootstrap attribute to point at the file we're going to use:

    bootstrap = "my_phpunit_bootstrap.php" >

In my_phpunit_bootstrap.php we import the old bootstrap and add our code to recreate the database and load the fixtures (you could also do this inside your tests):

I also set the test database driver to pdo_sqlite for faster tests.

Here are some other links I found useful before settling on this solution:

Fully isolated tests in symfony 2

What is the best way to create relationships with functional tests

Using doctrine entity manager in symfony2 unit tests

 

10 Oct 2011

vim clear command to delete all buffers

Just add the following to your .vimrc:

" type :Clear to remove all buffers
" (command! replaces any earlier definition, so re-sourcing .vimrc does not raise an error)
command! Clear bufdo bdelete

It's just easier to remember.

4 Oct 2011

python script to recursively replace string in filename (batch rename) and contents
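
A minimal sketch of how such a script could look, assuming text files are UTF-8 and that anything unreadable as text should be left alone. The function name replace_in_tree and its arguments are made up for this illustration. It walks the tree bottom-up so that files and sub-directories are renamed before the directory that contains them.

```python
import os


def replace_in_tree(root, old, new):
    """Replace `old` with `new` in file contents and in file/directory names under `root`.

    Returns (number of files whose contents changed, number of renames).
    """
    changed, renamed = 0, 0
    # Bottom-up walk: children are handled before their parent directory is renamed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # newline="" keeps the original line endings untouched.
                with open(path, encoding="utf-8", newline="") as handle:
                    text = handle.read()
            except (UnicodeDecodeError, OSError):
                continue  # binary or unreadable file: leave its contents alone
            if old in text:
                with open(path, "w", encoding="utf-8", newline="") as handle:
                    handle.write(text.replace(old, new))
                changed += 1
        # Rename files and sub-directories of this directory (not `root` itself).
        for name in filenames + dirnames:
            if old in name:
                os.rename(os.path.join(dirpath, name),
                          os.path.join(dirpath, name.replace(old, new)))
                renamed += 1
    return changed, renamed
```

Calling replace_in_tree("path/to/project", "old_name", "new_name") would then rewrite everything under that directory. The sketch does not check whether a rename target already exists, so try it on a copy or a clean checkout first.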

5 Sep 2011

apache htaccess to iis web.config, an example

Here's an example of an htaccess and its more or less equivalent web.config for iis

Apache's .htaccess:

Web.config:

10 Aug 2011

gotta love ack

no better tool to find code snippets in a directory. pretty pretty

6 Aug 2011

install node and npm in macosx snow leopard

It seems "configure is currently broken for some versions of MacOS", as stated here, so the only way I could install node was without openssl. Unfortunately this means that warnings are generated when installing npm, but otherwise everything seems ok.

9 Jul 2011

finding the slowest queries in mysql server using mysqldumpslow

First enable the slow query log by editing my.cnf (normally at /etc/my.cnf) and adding the three settings below to the [mysqld] section:

[mysqld]
.
.
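# set the path where the slow-log will be placed
# (newer MySQL versions use slow_query_log and slow_query_log_file instead)
log-slow-queries=/var/lib/mysqllogs/slow-log
# log all queries that take longer than one second to complete
long_query_time=1
# also log queries that do not use an index
log-queries-not-using-indexes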
.
.
[mysql.server]

Then restart the mysql server:
$ /etc/init.d/mysqld restart

Now give it some time to collect usage information and then run:

$ mysqldumpslow -s c -t 20 /var/lib/mysqllogs/slow-log

Notice the path to the log is the same as what's in the my.cnf file. Here we're getting the top 20 queries (-t 20) sorted by the number of occurrences in the log (-s c). Don't be scared if you see a bunch of S's and N's: without the -a option, all numbers are abstracted to N and strings to S, so that queries differing only in their values are counted together.
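
To get a feel for what that abstraction does, here is a rough Python approximation. It is not mysqldumpslow's own code (that is a Perl script with more rules); the function name abstract_query, the users table and the sample queries are made up for this illustration.

```python
import re
from collections import Counter


def abstract_query(sql):
    """Collapse literal values so queries that differ only in values look identical."""
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "'S'", sql)  # single-quoted strings -> 'S'
    sql = re.sub(r'"(?:[^"\\]|\\.)*"', '"S"', sql)  # double-quoted strings -> "S"
    sql = re.sub(r"\b\d+\b", "N", sql)              # standalone numbers -> N
    return sql


# Hypothetical queries, as they might appear in a slow log.
queries = [
    "SELECT * FROM users WHERE id = 42",
    "SELECT * FROM users WHERE id = 7",
    "SELECT * FROM users WHERE name = 'sofia'",
]

# Count occurrences per abstracted query, most frequent first (like -s c).
counts = Counter(abstract_query(q) for q in queries)
for query, count in counts.most_common():
    print(count, query)
```

The two id lookups collapse into a single entry, which is why sorting by count is useful for spotting the query shapes that show up most often.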

$ mysqldumpslow -a -s at -t 10 /var/lib/mysqllogs/slow-log

Here we're getting the top 10 queries (-t 10) sorted by average query time (-s at), with -a showing the real values instead of N and S.

27 May 2011

vim time tracking

Just built my first vim plugin: line-log.vim. It sends the current line in the buffer to a log file, flagging whether it's the start or the end of the task at hand. The log is semicolon-separated for easy report generation, and the plugin borrows heavily from the activity-log vim plugin.

Working together, the plugins generate log files in this format:

date time ; action ; path/to/file ; git branch; current line

So we could have:

2011-05-28 02:20:08;start_task;/Users/sofia/vimwiki/index.wiki;;- Build a vim script to log current line to log in same format as activity-log.vim
2011-05-28 02:44:11;end_task;/Users/sofia/vimwiki/index.wiki;;- Build a vim script to log current line to log in same format as activity-log.vim
2011-05-28 02:44:01;open;/Users/sofia/vimwiki/index.wiki
2011-05-28 02:44:54;write;/Users/sofia/vimwiki/index.wiki;master
2011-05-28 02:45:14;write;/Users/sofia/vimwiki/index.wiki;master

Both git-branch and current line text are optional. Activity log automatically logs each time you create, open and save a file. 
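
As an example of the kind of report this makes possible, here is a small Python sketch that pairs start_task and end_task entries and adds up the time spent per task. It assumes the field order shown above and that a task is identified by its line text; the function name task_durations is made up for this illustration.

```python
from datetime import datetime, timedelta

# Sample entries in the log format shown above.
LOG_LINES = [
    "2011-05-28 02:20:08;start_task;/Users/sofia/vimwiki/index.wiki;;- Build a vim script to log current line to log in same format as activity-log.vim",
    "2011-05-28 02:44:11;end_task;/Users/sofia/vimwiki/index.wiki;;- Build a vim script to log current line to log in same format as activity-log.vim",
    "2011-05-28 02:44:54;write;/Users/sofia/vimwiki/index.wiki;master",
]


def task_durations(lines):
    """Return {task text: total timedelta} from start_task/end_task pairs."""
    started, durations = {}, {}
    for line in lines:
        # At most 5 fields, so a ';' inside the task text stays in the last field.
        fields = line.rstrip("\n").split(";", 4)
        if len(fields) < 5:
            continue  # open/write events and malformed lines carry no task text
        when = datetime.strptime(fields[0].strip(), "%Y-%m-%d %H:%M:%S")
        action, task = fields[1].strip(), fields[4].strip()
        if action == "start_task":
            started[task] = when
        elif action == "end_task" and task in started:
            elapsed = when - started.pop(task)
            durations[task] = durations.get(task, timedelta()) + elapsed
    return durations


for task, spent in task_durations(LOG_LINES).items():
    print(spent, task)
```

To run it on a real log, pass an open file object instead of LOG_LINES, since the function only needs something that yields lines.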

As for the why, a quote from the original developer of activity-log:

The main purpose of this plugin is to make time reporting easy for the forgetful programmer, without additional tools it provides a static log for you, but the format is such that you can easily import or analyze using whatever tool you feel comfortable with.

19 May 2011

Use Coda's Panic Font in vim

Find the Coda application, right-click it and choose Show Package Contents, then in Contents->Resources double-click Panic Sans.dfont and install it.
In vim, go to the command line to edit ~/.vimrc (or ~/.vimrc.local):
:e ~/.vimrc
and add:
set guifont=PanicSans:h13
Save and either quit and relaunch vim to see the changes, or reload the config using :so % (which basically sources the current buffer's contents). And by the way another great - open-source - font is Inconsolata.

Update June 20 2011 on good programming fonts

Menlo is really nice as well. Found it because it's the font used in the console in Snow Leopard.

sofia cardita's Space

developer at naveinteractions.com. curious and self-learner. love: web development, php, python, appreciating the strange and dangerous genius of javascript.