Programming

Photo Unshredder

Instagram Engineering Challenge: The Unshredder
"Your challenge, if you choose to accept it, is to write a simple script that
takes a shredded image in as input and outputs an unshredded and reconstituted image."

So here's the sample image:

We're given the information that the slices are 32 pixels wide each, so I don't actually
address finding slices.

My Solution Process

Disclaimer: I've never really worked with images before and have little to no
knowledge of the research and techniques often used in this area.

The challenge appears to be figuring out the best way to put the pieces back together.

My first thought was that I could sample X pixels from each edge of a slice, grab
the RGB value of each pixel and compare the differences.

The reasoning behind this was RGB is the easiest way I can think of to compare
differences in colors between pixels. Sampling would hopefully make it faster.

I made one critical mistake: instead of comparing pixel to pixel, I calculated
the total R, G and B along an edge and compared the edge sums rather than the
per-pixel differences.

I didn't understand why that was wrong until the example of a checkerboard was
given. If you compared two edges cut from a checkerboard, they could be equal on
average but very different in a pixel-to-pixel comparison.
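
To make the checkerboard point concrete, here is a minimal sketch in PHP. It assumes
an edge is simply an array of [R, G, B] triples (no image library involved), and the
names edgeSum and edgeDifference are made up for illustration; they are not taken
from the code on Github.

```php
<?php
// Hypothetical helper: total of all R, G and B values along an edge.
function edgeSum(array $edge): int {
    $total = 0;
    foreach ($edge as $pixel) {
        $total += array_sum($pixel);
    }
    return $total;
}

// Hypothetical helper: sum of absolute per-channel differences,
// comparing the two edges pixel by pixel.
function edgeDifference(array $a, array $b): int {
    $total = 0;
    $n = min(count($a), count($b));
    for ($i = 0; $i < $n; $i++) {
        for ($ch = 0; $ch < 3; $ch++) {
            $total += abs($a[$i][$ch] - $b[$i][$ch]);
        }
    }
    return $total;
}

$black = [0, 0, 0];
$white = [255, 255, 255];

// Two checkerboard edges that are offset by one square.
$edgeA = [$black, $white, $black, $white];
$edgeB = [$white, $black, $white, $black];

echo edgeSum($edgeA) - edgeSum($edgeB), "\n";  // 0: the sums cannot tell them apart
echo edgeDifference($edgeA, $edgeB), "\n";     // 3060: every pixel is as different as it can be
```

The summed version reports a perfect match for two edges that do not line up at all,
which is exactly the mistake described above.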

So, I had to revise my solution to compare pixel to pixel on each slice against
its opposing slices: the left side of slice 1 was compared against the right side
of every other slice in the image. The logic is that the slices with the least
difference between them probably fit together.
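
A rough sketch of that matching step, under some assumptions: each slice is
represented as an array with a 'left' and a 'right' edge, each edge being a list of
[R, G, B] triples from top to bottom. The function names and the tiny sample data are
hypothetical and only meant to show the idea.

```php
<?php
// Pixel-to-pixel difference between two edges of equal height.
function columnDifference(array $a, array $b): int {
    $total = 0;
    foreach ($a as $y => $pixel) {
        for ($ch = 0; $ch < 3; $ch++) {
            $total += abs($pixel[$ch] - $b[$y][$ch]);
        }
    }
    return $total;
}

// Index of the slice whose right edge best matches the left edge of slice $i.
function bestLeftNeighbour(array $slices, int $i): int {
    $best = -1;
    $bestDiff = PHP_INT_MAX;
    foreach ($slices as $j => $candidate) {
        if ($j === $i) {
            continue; // a slice cannot sit next to itself
        }
        $diff = columnDifference($slices[$i]['left'], $candidate['right']);
        if ($diff < $bestDiff) {
            $bestDiff = $diff;
            $best = $j;
        }
    }
    return $best;
}

// Made-up data: three slices with one-pixel-tall edges.
$slices = [
    ['left' => [[10, 10, 10]],    'right' => [[50, 50, 50]]],
    ['left' => [[52, 49, 50]],    'right' => [[200, 200, 200]]],
    ['left' => [[198, 201, 200]], 'right' => [[90, 90, 90]]],
];

echo bestLeftNeighbour($slices, 1), "\n"; // 0
echo bestLeftNeighbour($slices, 2), "\n"; // 1
```

Note that this always returns some neighbour, even for the slice that really belongs
on the far left of the image, so on its own it cannot tell where the image starts.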

Almost. Something is wrong here. The striped building seems to be throwing it off.
The striped building has the highest difference value of any left-right pair,
so my algorithm kept putting it on the side.

How can I make the striped building fit together? I tried cheating and seeded the
proper right end image (slice 10) and the image constructed itself perfectly.

Well, the only solution I came up with was somewhat brute force in nature. What
if I try compiling the image with every slice as an edge and see what the total
computed difference of the image is?

Voila! The only way I could think of to overcome this was to show that the whole
image turned out better despite the high-difference pair.
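
Here is a minimal sketch of that brute-force idea. It assumes the pairwise differences
have already been computed into a table, where $cost[$l][$r] is the difference between
the right edge of slice $l and the left edge of slice $r. The function names and the
numbers in the example are invented; this is the general shape of the approach, not
the code from the repository.

```php
<?php
// Greedily build an ordering that ends with $rightEnd, working right to left.
function buildFromRightEnd(array $cost, int $rightEnd): array {
    $order = [$rightEnd];
    $unused = array_diff(array_keys($cost), [$rightEnd]);
    $total = 0;
    while ($unused) {
        $current = $order[0]; // leftmost slice placed so far
        $best = null;
        foreach ($unused as $candidate) {
            if ($best === null || $cost[$candidate][$current] < $cost[$best][$current]) {
                $best = $candidate;
            }
        }
        $total += $cost[$best][$current];
        array_unshift($order, $best);
        $unused = array_diff($unused, [$best]);
    }
    return ['order' => $order, 'total' => $total];
}

// Try every slice as the right end and keep the cheapest whole image.
function unshred(array $cost): array {
    $best = null;
    foreach (array_keys($cost) as $rightEnd) {
        $attempt = buildFromRightEnd($cost, $rightEnd);
        if ($best === null || $attempt['total'] < $best['total']) {
            $best = $attempt;
        }
    }
    return $best;
}

// Invented example: the true order is 0, 1, 2, and the correct pair (1, 2)
// has a fairly high difference, like the striped building.
$cost = [
    0 => [0 => 0,  1 => 5,  2 => 60],
    1 => [0 => 70, 1 => 0,  2 => 40],
    2 => [0 => 90, 1 => 80, 2 => 0],
];

$result = unshred($cost);
echo implode(',', $result['order']), ' total=', $result['total'], "\n"; // 0,1,2 total=45
```

With N slices this runs the greedy build N times, so it is roughly N times slower than
a single pass, which is fine for an image with a couple dozen slices.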

My code is publicly available on Github

I apologize if you actually read the code: it's a mess, with a lot of testing going on, commented-out code, and things that aren't used in the final version. I thought it would be a good way to learn about ImageMagick and PHP's Imagick class (which is the worst-documented thing I've seen on PHP.NET).

Writing Clean Code

Great video about writing clean code (which sadly cannot be embedded): http://vimeo.com/12643301

I thought I was doing an OK job, but it really shed light on some things I could do to improve my code.

Funny thing is, I could have gone there, but at the time I didn't think I would be writing that much code in the future. It was 15 minutes away in Malmo!

Natural Language Processing (Comic)

A stressful day trying to work with NLP leads to things like this.

Natural Language Processing Comic by Kevin Ohashi

Dear Afternic

You are still emailing me lost passwords in plaintext. This just isn't acceptable.

I contacted you and worked my way through your support team until I reached a manager who was supposed to be connected to the dev team. The manager asked me what email client I used and said maybe it was Outlook that was revealing my password. My email client (and I don't even use Outlook) was allegedly cracking the passwords or something. I am not even sure what they were trying to say or imply. Whatever it was, it's ridiculous.

I only noticed this because I reactivated an old account, thinking that listing with you guys would be a good complement to listing on Sedo since you were also free now. I want to be your customer. I also want you to treat my information with respect, and keeping my password secure is something I simply cannot compromise on. Please fix this issue so we can get back to selling domain names, because I simply won't do business with you until you do.

Open Sourcing DomainToad.com Domain Name News Aggregator

If 500 people subscribe to the DomainToad newsletter by April 1, 2011.

Oh look, a catch! Shocking!

Why would I do that? Because I spent a fair amount of time and some money to create the website and I would like to see that people at least try it. Releasing the code publicly will also take some more of my time to clean it up a bit and package it. If nobody cares enough or finds it useful enough to use, then I won't bother spending my time to release it to everyone for free.

I've released a lot of software I've written over the years for free or provided public access to it for free. Most of it probably never got used by anyone but me. If people are genuinely interested in having a copy of the code I wrote for Domain Toad, they will subscribe to the newsletter and get their friends to subscribe as well. It's free and provides headlines from major domain blogs in your email daily. Other websites charge for that 'luxury.'

If the newsletter gets 500 subscribers before April 1, 2011 I will release the code with the open source MIT license. How will people be notified? I will post on my blog and email the newsletter subscribers a notice if it has 500 subscribers by April 1, 2011.

If someone is really that desperate to buy it, I will sell copies for $250 as is with 1 hour of support to get started with a non-exclusive license to use/modify it but not distribute unless it becomes open source. Contact me.

My Internal SEO Checklist

I compiled a list of everything I know of that you can do internally (on your own webpage) to improve your SEO and rankings. I have broken it down into categories: header, content, navigation, graphic/image, and other. The header stuff is inside the <head> tag. Content is about your actual text content. Navigation is structural/linking. Graphic/Image is about <IMG> and when to use it. Other is everything else.

This checklist does not tell you exactly how to resolve your specific SEO issues. It is designed as a reminder of all the things you can do to try to improve your rankings.

Feel free to add anything you feel is missing for internal SEO and I would be happy to add it.

Header Related

  • Title Tags – Proper <title> tags with relevant keyword on each page.
  • Meta Description – Describe the content of the page briefly (155 chars according to SEOmoz).

MagpieRSS Encoding Problems

I launched a blog aggregation website using MagpieRSS recently. I had all sorts of strange errors with encoding.

I had strings that had black diamonds with question marks inside of them.

I had strings that looked like: Today?s president obama said ?blah blah?

I tried (and wasted a lot of time) debugging my own code for far longer than I care to admit:
functions to replace certain patterns in strings, encoding and decoding inputs, etc.

What solved the issues was patching MagpieRSS.

I changed MAGPIE_OUTPUT_ENCODING and MAGPIE_INPUT_ENCODING in rss_fetch.inc to 'UTF-8'

Around line 358 you should see the original definitions (commented out below); replace them with the UTF-8 versions:

if ( !defined('MAGPIE_OUTPUT_ENCODING') ) {
    // define('MAGPIE_OUTPUT_ENCODING', 'ISO-8859-1');
    define('MAGPIE_OUTPUT_ENCODING', 'UTF-8');
}

if ( !defined('MAGPIE_INPUT_ENCODING') ) {
    // define('MAGPIE_INPUT_ENCODING', null);
    define('MAGPIE_INPUT_ENCODING', 'UTF-8');
}

The only problem left was converting all the content to proper UTF-8 format.
Also make sure your database is using UTF-8. I chose the general collation, but choose what suits you.

function encodeutf8($string){
return htmlentities(html_entity_decode($string,ENT_QUOTES,'UTF-8'),ENT_QUOTES,'UTF-8');
}

I html_entity_decode first because if anything in the text is already encoded (and many RSS feeds are),
htmlentities will encode the ampersand that marks it as a special character. So you run into issues
with &#8217; becoming &amp;#8217;, which displays the literal text '&#8217;' instead of a single quote ’.
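
A small demonstration of the double-encoding problem, using only PHP's built-in
htmlentities and html_entity_decode. The feed text is made up, and the function is
repeated here only so the snippet runs on its own.

```php
<?php
// Same function as above, repeated so this snippet is self-contained.
function encodeutf8($string){
    return htmlentities(html_entity_decode($string, ENT_QUOTES, 'UTF-8'), ENT_QUOTES, 'UTF-8');
}

// Made-up feed text that already contains an entity-encoded apostrophe.
$fromFeed = 'Today&#8217;s news';

// Encoding directly escapes the ampersand, so the entity shows up as literal text.
echo htmlentities($fromFeed, ENT_QUOTES, 'UTF-8'), "\n";  // Today&amp;#8217;s news

// Decoding first turns the entity into a real character before re-encoding it.
echo encodeutf8($fromFeed), "\n";                          // Today&rsquo;s news
```

Running a string through encodeutf8 twice gives the same result as running it once,
which is handy when you can't be sure whether a feed has already been encoded.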

If you still encounter problems, adding a header in PHP to the page displaying content also helped remove
strange characters and fixed encoding issues.

header('Content-Type: text/html; charset=UTF-8');

Those are all the tricks I've picked up so far trying to make MagpieRSS function properly. Hopefully that
saves others some pain and frustration. Feel free to leave any other tips or ask questions in the comments.

List Manipulator - Tool for Manipulating Lists of Data

I just wanted to introduce my latest tool, List Manipulator.
Basically, it allows you to quickly edit and format lists.

The basic functions:
filtering (alpha, alphanumeric, domain names)
matching (only contains X or does not contain X)
replacing (find X replace with Y)
prefix/suffix adding to all items
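
To give a feel for what those operations amount to, here is a rough PHP sketch using
only built-in array and string functions. The sample items and variable names are made
up, and this is an illustration of the idea rather than the tool's actual code.

```php
<?php
$items = ['example.com', 'toad', 'buzz99', 'domaintoad.com'];

// Filtering: keep only purely alphabetic items.
$alphaOnly = array_values(array_filter($items, 'ctype_alpha'));

// Matching: keep only items that contain "toad".
$containsToad = array_values(array_filter($items, function ($item) {
    return strpos($item, 'toad') !== false;
}));

// Replacing: find ".com" and replace it with ".net" in every item.
$replaced = str_replace('.com', '.net', $items);

// Prefix/suffix: wrap every item in <li> tags.
$wrapped = array_map(function ($item) {
    return '<li>' . $item . '</li>';
}, $items);

echo implode("\n", $wrapped), "\n";
```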

Why did I build this?

I find myself often working with lists of data:
keyword lists, domain name lists, HTML lists, repetitive code, etc.

I always do it manually or write a script each time to fix my current problem.
This time, I decided to build a tool to solve most of the general problems I
run into when formatting and filtering the data. Hopefully it saves me (and others!)
some time in the future.

If you have any feature requests, bugs or suggestions please contact me or comment!

Project 5 - News Aggregator

Yesterday I built the beginnings of a news aggregator.

What it does:

  • Aggregates RSS feeds to database
  • Each feed has customizable title, image (forced size 100x100)
  • Displays RSS items sorted by date (newest first)
  • Almost MVC - design and content are relatively separate but not fully. I didn't use a PHP Templating Engine, it's hard coded with functions that do output some HTML
  • Sitewide variables done via static PHP include.
  • Shows Comments (Disqus)
  • Allows Commenting on article (Disqus)

What it doesn't do (yet):

  • Pagination - only showing 10 items, no way to scroll back further
  • Search - no search implemented yet
  • Channels - view all items posted by a certain blog
  • Admin Interface - Adding new sites, editing is all done directly via MySQL.
  • Magic Tricks - Illusion Michael. A trick is something a whore does for money... or candy.

Burnout

Just couldn't get much done today. Wasn't very focused, lots of distractions. Holiday season is starting to kick in as the family returns home.

I fixed some bugs with Buzz Scanner. The basic problem is that Google Charts limits the URL length, so you cannot pass an unlimited amount of data in to graph.

The solution: a method that adds each pair of adjacent data points together:
3,6,3,8,9,6,3,4,5,0,6 becomes 9,11,15,7,5,6 (if the count is odd, the last value is left as is)

It expects data already in the URL format for Google Charts API in the chd field.

Sample expected input:
t:0|18,10,15,7|0|14,13,14,14
Sample expected output:
t:0|28,22|0|27,28
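
For reference, a compact sketch of that pairwise merging, written independently of the
function that follows. The name mergePairs is hypothetical, and it assumes the input is
a chd string with an optional "t:" style prefix and "|"-separated series of
comma-separated numbers.

```php
<?php
// Hypothetical sketch: sums adjacent pairs in each "|"-separated series
// of a Google Charts chd string.
function mergePairs(string $chd): string {
    $prefix = '';
    if (strpos($chd, ':') !== false) {
        // Split off the encoding marker (for example "t") so it is not summed.
        list($prefix, $chd) = explode(':', $chd, 2);
        $prefix .= ':';
    }
    $series = [];
    foreach (explode('|', $chd) as $values) {
        $merged = [];
        foreach (array_chunk(explode(',', $values), 2) as $pair) {
            // A lone trailing value forms a chunk of one and is kept unchanged.
            $merged[] = array_sum($pair);
        }
        $series[] = implode(',', $merged);
    }
    return $prefix . implode('|', $series);
}

echo mergePairs('t:0|18,10,15,7|0|14,13,14,14'), "\n"; // t:0|28,22|0|27,28
echo mergePairs('3,6,3,8,9,6,3,4,5,0,6'), "\n";        // 9,11,15,7,5,6
```

Each call roughly halves the number of data points, so it can be applied repeatedly
until the URL is short enough.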

The PHP code:

function mergeData($dataString){
    $chdTempArray = explode("|",$dataString);
    for($c=0;$c<count($chdTempArray);$c++){