
Twitter API Fun

The other day I was looking through the Twitter API docs for inspiration for a side project. I was thinking about creating my own Twitter library for PHP, but first I wanted to do something quick and dirty, just to get my feet wet and to see how others have implemented the API in PHP.

If you’re using Twitter and are reading this blog, I’m sure you are aware of hashphp. If not, it is a Twitter account which grabs most of the tweets with the hashtag ‘#php’ and re-tweets them. Since this seemed like a fairly simple thing to do, I decided to create my own Twitter feeder. Again, I was just hacking together a script, so if you use the code presented you will want to make some enhancements before letting it into the wild. (Please see the end of the post.)

First, I decided to pick a hashtag which was used roughly 3 – 5 times every 15 minutes; ‘#mysql’ seemed to fit the bill.

Now I needed a Twitter account to re-tweet the messages, and a bit.ly account to create short URLs to link back to the original account that posted the message. I created a ‘poundmysql’ account (the number sign ‘#’ is sometimes referred to as the ‘pound’ sign in the US), then I went to bit.ly and signed up for an account.

Continuing to gather the pieces of the puzzle, I looked through the published PHP libraries for Twitter and found one which did exactly what I needed. There were other, more robust scripts, but I only needed to update an account status, so I grabbed the Twitter package by Felix Oghina. The next piece came by way of the Bitly class, written by Ruslanas Balciunas. (I apologize to Ruslanas Balciunas but my current character set in MySQL will not allow for a proper spelling of the last name.)

Next, I needed to be able to retrieve all the tweets which contained the string #mysql. Twitter offers a Search API which would do the job, but for my purposes search.twitter was the faster option. It offers an XML feed for any search, so I only needed to search for #mysql and grab the URL of the feed, which is:

http://search.twitter.com/search.atom?q=%23mysql

As you can see, the search term is contained in the query string of the URL (URL-encoded), which is useful since I could then set up the script to accept any search term.

OK, now with all the pieces of the puzzle at hand, I only needed to write the code to fit them together into my own Twitter feeder.

First I grabbed a function which I used in a long-ago project [probably also swiped from the internet :)]. The function sends an HTTP request to a server and returns the response.

function url_get($domain, $uri, $referer='')
{
    //extra request headers; cURL writes the request line itself,
    //so 'GET ... HTTP/1.1' must not be added to this list
    $header = array();
    $header[] = 'Host: '.$domain;
    $header[] = 'User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;rv:1.8) Gecko/20051111 Firefox/1.5';
    $header[] = 'Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5';
    $header[] = 'Accept-Language: en-us,en;q=0.5';
    $header[] = 'Accept-Encoding: gzip,deflate';
    $header[] = 'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7';

    $header[] = 'Keep-Alive: 300';
    $header[] = 'Connection: keep-alive';

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'http://'.$domain.$uri);
    //return the response as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    //empty string: let cURL decode any compressed response it supports
    curl_setopt($ch, CURLOPT_ENCODING, "");
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header);

    //curl_exec() returns the response body, or FALSE on failure
    $result = curl_exec($ch);

    curl_close($ch);

    return $result;
}

As you can see, the function accepts three parameters: the domain to connect to ($domain), the path to the requested page ($uri) and the referrer ($referer). This function uses the cURL extension; if your PHP configuration has allow_url_fopen turned on you don’t even need this function (more on that in a minute).

Next I set up the $domain, $uri and $search_term variables, and called the function to get the XML from search.twitter.


//ideally you would put this in a configuration file
$search_term = '#mysql';

//create variables to be sent to the function
$domain = 'search.twitter.com';
$uri    = '/search.atom?q='.urlencode($search_term);

//retrieve the xml response string from search.twitter
$feed = url_get($domain, $uri);
 

Now that the script has the information needed from Twitter as an XML string, I can create a SimpleXML object using the following:


$xml = new SimpleXMLElement($feed);

If allow_url_fopen is set to ‘on’ you could skip directly to creating the SimpleXML object by using the URL as the first parameter and setting the third parameter to TRUE:


$feed = 'http://search.twitter.com/search.atom?q='.urlencode($search_term);

//second parameter is the integer libxml options bitmask (0 = none)
$xml = new SimpleXMLElement($feed, 0, TRUE);
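
Neither form checks whether the feed actually arrived or parsed. Here is a minimal sketch of a more defensive parse, assuming the feed has already been fetched into a string. The parse_feed() helper and the sample XML are made up for illustration and are not part of the original script:

```php
<?php
// Hypothetical helper: parse a feed string, returning NULL instead of
// emitting warnings when the input is missing or is not valid XML.
function parse_feed($feed)
{
    if (!is_string($feed) || '' === trim($feed)) {
        return null; // e.g. the HTTP request failed and returned FALSE
    }

    // collect libxml errors quietly instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);
    $xml      = simplexml_load_string($feed);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return (false === $xml) ? null : $xml;
}

// made-up sample in the general shape of an Atom search feed
$sample = '<feed xmlns="http://www.w3.org/2005/Atom">'
        . '<entry><title>Testing #mysql</title></entry>'
        . '</feed>';

$xml = parse_feed($sample);
```

With something like this in place, the script could simply exit when parse_feed() returns NULL and try again on the next run.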

The tweets in the feed are ordered by the time they were originally posted, and the timestamps are in UTC, so I needed PHP to use UTC when calling date and time functions. To do this I used the date_default_timezone_set function.


date_default_timezone_set('UTC');

Next I included the Twitter class and the Bitly class into the script and created new instances of each:


//require the two files which define the classes used
require_once('Twitter.class.php');
require_once('bitly.class.php');

//setup log in information for twitter and bit.ly
//again these should really be kept in a separate config file
//outside the document root
$twit_uname = 'twitter_username';
$twit_pword = 'twitter_password';

$bit_uname = 'bit.ly_username';
$bit_api   = 'bit.ly_api_key';

//instantiate new objects
$twitter = new Twitter($twit_uname, $twit_pword);
$bitly   = new Bitly($bit_uname, $bit_api);

The idea is to loop through the messages received from the search and update the Twitter account with any message that was posted within the last 15 minutes. I set up the last run time of the script outside the loop so that it only needs to be calculated once.

//define the number of minutes between each script run
//again this is a setting which should be in the config file
$time_interval = 15;

//get the UNIX timestamp (seconds since the epoch)
//of the last script run
$last_run_time = time()-($time_interval*60);

The messages can be found in $xml->entry. Within the loop the first check I needed to make was to determine if the twitter feeder account (poundmysql) had posted the message and if so, skip that message.


//setup messages loop
foreach ($xml->entry as $update) {
    //skip any messages sent by poundmysql
    if ('poundmysql' == $update->author->name) {
        continue;
    }
...

Next I checked the time the message was posted. If that time is earlier than 15 minutes ago, I could end the loop and the script itself, since the feed lists the newest messages first and any entries after this one would also fall before the 15 minute cutoff point.

...
    //time format in xml string: 2009-07-25T20:32:13Z
    //split time from date
    list($pub_date, $pub_time) = explode('T', $update->published);
    //remove the ending 'Z' from time
    $pub_time = substr($pub_time, 0, -1);

    //get time segments for mktime() call
    list($hour, $minute, $second) = explode(':', $pub_time);
    list($year, $month, $day)     = explode('-', $pub_date);

    //get the UNIX timestamp (seconds since the epoch)
    //of the time the message was posted
    $publish_time  = mktime($hour,  $minute, $second,
                            $month, $day,    $year);

    //check if message was posted before the last run time,
    //if so end loop
    if ($publish_time < $last_run_time) {
        break;
    }
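
The manual splitting above could probably be shortened, since strtotime() understands this timestamp format, including the trailing 'Z' for UTC. Here is a minimal sketch; the is_before_cutoff() helper is an invention for this example and is not in the original script:

```php
<?php
date_default_timezone_set('UTC');

// Hypothetical helper: TRUE when the published timestamp is older than
// the cutoff (a UNIX timestamp), or when it cannot be parsed at all.
function is_before_cutoff($published, $cutoff)
{
    $publish_time = strtotime((string)$published);

    if (false === $publish_time) {
        return true; // unparseable date: treat the message as too old
    }

    return $publish_time < $cutoff;
}

$time_interval = 15;
$last_run_time = time() - ($time_interval * 60);

$old_post = '2009-07-25T20:32:13Z';            // format used by the feed
$new_post = gmdate('Y-m-d\TH:i:s\Z', time());  // a message posted just now
```

Inside the loop, the check would then read: if (is_before_cutoff($update->published, $last_run_time)) { break; }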

If the message passes these two checks, I know that it will be used in the feeder.

I then created a bit.ly link for the original message ($update->link[0]['href']) and stored the character length of the link. Then I grabbed the content of the message ($update->title) and stored that character length.

    //grab url of the original message, create short link, store length
    $link       = (string)$update->link[0]['href'];
    $short_link = ' ..'.$bitly->shortenSingle($link);
    $short_len  = strlen($short_link);

    //grab tweet content store length
    $content     = (string)$update->title;
    $content_len = strlen($content);

Since Twitter has a maximum length of 140 characters, I then needed to check the length of the new message with the bit.ly link added. If it was longer than 140, I needed to cut off part of the original message to fit within the limit.


    //total message length
    $len_total = $short_len+$content_len;

    if (140 < $len_total) {
        //determine how many characters over 140
        $over = $len_total-140;

        //remove that many characters from the original message
        $content = substr($content, 0, -$over);
    }
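
One caveat: strlen() and substr() count bytes, so a tweet containing multi-byte UTF-8 characters is measured as longer than it really is and can be cut in the middle of a character. Here is a rough character-aware variant, assuming the mbstring extension is available and the feed is UTF-8. build_status() is a hypothetical helper, not part of either library, and the short link is made up:

```php
<?php
// Hypothetical helper: join message and link, trimming the message so
// the result is at most $max characters (not bytes).
function build_status($content, $short_link, $max = 140)
{
    $content_len = mb_strlen($content, 'UTF-8');
    $len_total   = $content_len + mb_strlen($short_link, 'UTF-8');

    if ($max < $len_total) {
        //number of characters to drop from the end of the message
        $over    = $len_total - $max;
        $keep    = max(0, $content_len - $over);
        $content = mb_substr($content, 0, $keep, 'UTF-8');
    }

    return $content . $short_link;
}

$short_link = ' ..http://bit.ly/example';      // made-up short link
$long_tweet = str_repeat("\xC3\xA9", 200);     // 200 characters, 400 bytes
$new_status = build_status($long_tweet, $short_link);
```

The byte-based version in the post errs on the side of cutting too much, so it stays within the limit. The sketch only matters if tweets with accented or non-Latin characters should survive intact.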

The only thing left to do was to actually update the twitter status.


    $new_status = $content.$short_link;
    //update twitter status
    $twitter->update($new_status);
}
//end loop

That ends the PHP code for my Twitter feeder. I then set up a cron job to run every fifteen minutes, and I had my own Twitter feeder feeding the MySQL community.

*/15 * * * * GET /path/to/script

Since I didn’t want to actually monitor this feed, I let it run for a while, and once I verified everything was working correctly I deleted the Twitter account and stopped the script from running.

A copy of the code used in this post can be found here.

Enhancements

Some updates you will probably want to make before actually using this script for a real Twitter account:

  • Move all the configuration options to their own file and store it outside the document root.
  • Add error checking for unavailable third-party services.
  • Reverse the tweets before running the loop. At the moment the script re-tweets messages in the reverse of the order in which they were originally posted.
  • Implement a filtering system so no unwanted messages are inserted into your stream.
  • Update the script to use the Twitter Search API to guarantee that all tweets with your hashtag are captured.
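
As a starting point for the ordering item, the entries could be copied into an array and reversed before looping. Here is a small sketch with a made-up two-entry feed. It assumes, as the loop in the post does, that the feed lists the newest message first:

```php
<?php
// made-up feed, newest entry first
$feed = '<feed xmlns="http://www.w3.org/2005/Atom">'
      . '<entry><title>second tweet</title></entry>'
      . '<entry><title>first tweet</title></entry>'
      . '</feed>';

$xml = new SimpleXMLElement($feed);

//copy the entries into a plain array so they can be reordered
$updates = array();
foreach ($xml->entry as $update) {
    $updates[] = $update;
}

//oldest first, so re-tweets go out in the original order
$updates = array_reverse($updates);

//collect the titles in the order they would now be posted
$titles = array();
foreach ($updates as $update) {
    $titles[] = (string)$update->title;
}
```

The 15 minute cutoff would then have to be applied while filling the array rather than in the posting loop, because the break in the original loop relies on the newest-first order.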
