0plus1 Blog

Uncategorized

Facebook Hacker Cup 2011 Qualification – Studious Student

by admin on Jan.09, 2011, under Uncategorized

Here we go with another problem. Here’s the test

Studious Student

You’ve been given a list of words to study and memorize. Being a diligent student of language and the arts, you’ve decided to not study them at all and instead make up pointless games based on them. One game you’ve come up with is to see how you can concatenate the words to generate the lexicographically lowest possible string.

Input

As input for playing this game you will receive a text file containing an integer N, the number of word sets you need to play your game against. This will be followed by N word sets, each starting with an integer M, the number of words in the set, followed by M words. All tokens in the input will be separated by some whitespace and, aside from N and M, will consist entirely of lowercase letters.

Output

Your submission should contain the lexicographically shortest strings for each corresponding word set, one per line and in order.

Constraints

1 <= N <= 100
1 <= M <= 9
1 <= all word lengths <= 10

Basically we have to find the composed string that is the lexicographically‘s shortest. So where is the issue here? The issue is that the problem states that you have to find the shortest possible composed string, so order each of the words cannot be simply ordered and combined because that may or may not be the correct solution (we will go there later). The easiest way to do this is to recursively test each of the possible permutations of the words against each other, this is the bruteforcing method. Let’s see.

Bruteforcing Method

First we have to parse the input:

function fromTxtToArray($filename){
    $fh = fopen($filename, "rb");
    $data = fread($fh, filesize($filename));
    fclose($fh);

    $array = explode("\n",$data);
    $rows = $array[0];
    unset($array[0]);

	$i = 0;
	$return = array();
	foreach ($array as $k => $v){
		$return[$i] = explode(' ',trim($v));
		unset($return[$i][0]);
		$i++;
	}
    return array('rows' => $rows, 'data' => $return);
}

Then we have to permutate each of the words we have been given and then sort them (php sort luckily automagically sorts strings lexicographically) returning the first one.

//source of this function -> http://docstore.mik.ua/orelly/webprog/pcook/ch04_26.htm
function permute($items, $perms = array()) {

	global $permutations;

    if (empty($items)) {
        $permutations[] = join('', $perms);
    }  else {
		for ($i = count($items)-1; $i >= 0; --$i) {
		     $newitems = $items;
             $newperms = $perms;
             list($foo) = array_splice($newitems, $i, 1);
             array_unshift($newperms, $foo);
             permute($newitems, $newperms);
         }
		 return $permutations;
    }
}

function sortAndKill($array){
	sort($array,SORT_STRING);
	return $array[0];
}

$data = fromTxtToArray('input.txt');

set_time_limit(0);

foreach ($data['data'] as $k => $v){
	$permutations = array();
   	$combinations = permute($v);
	echo sortAndKill($combinations)."\n";
	unset($permutations);
}

This works perfectly with the example, sadly with the real output it will take too long (I let it run for more than 20 minutes with no result). If you wanna test it just copy and paste the test input into a txt file.

5
6 facebook hacker cup for studious students
5 k duz q rc lvraw
5 mybea zdr yubx xe dyroiy
5 jibw ji jp bw jibw
5 uiuy hopji li j dcyi

It’s obvious that we have to switch to dp.

Dynamic Programming

What is the problem here? Is that we have too many cases to test, we have to find a way to exclude obviously wrong cases. We can start by ordering the words we have been given and see the results.

function lexiOrder($array){
	sort($array,SORT_STRING);
    $string = false;
    foreach ($array as $k => $v){
        $string .= trim($v);
    }
    return $string;
}

$data = fromTxtToArray('input.txt');

$result = false;
foreach ($data['data'] as $k => $v){
    $result .= lexiOrder($v)."\n";
}
echo $result;

It works! No wait a minute..

The string: bwjijibwjibwjp is wrong it should be: bwjibwjibwjijp this is because while ji comes before jibw merging them togheter will return jijibw while the correct one is jibwji (b comes before j) the standard sorting will not work anymore.. we have to use usort.

function letterwalk($a,$b){

    if ($a == $b) {
        return 0;
    }
    for ($i=0;$i<10;$i++){

        $lettera = substr($a, $i, 1);
        $letterb = substr($b, $i, 1);

		if ($lettera < $letterb){
			return -1;
		} elseif($lettera > $letterb) {
			return 1;
		}
    }
}

//usort($array, "letterwalk");

So what is this? It’s a function to walk and compare sequentially every single letter of the word against each other. But wait we still get the same result as before! This is because comparing ji agains jibw will fail because b will get compared to null. To avoid this we have to edit the function:

function letterwalk($a,$b){

    if ($a == $b) {
        return 0;
    }
    for ($i=0;$i<10;$i++){

        $lettera = substr($a, $i, 1);
        $letterb = substr($b, $i, 1);

        if($lettera && $letterb) {
            if ($lettera < $letterb){
                return -1;
            } elseif($lettera > $letterb) {
                return 1;
            }
        } else {
            if($lettera==$letterb){
                return 0;
            } elseif ($lettera < $letterb) {
                return 1;
            } else {
                return -1;
            }
        }
    }
}

Finally the example works! But we are not finished yet. The problem is that there may be case where a shorter string must go AFTER the longest one.
to clarify:

ji / jibw

jibwji is the correct answer, so ji must go after jibw.

but

aa / aabb

aaaabb is the correct answer, aa must go before aabb

How do we test this? We can sort them in two different ways! So here we have the final solution:

function lexiOrder($array,$mode){

	if ($mode == 'letter'){
		usort($array, "letterwalk");
	} elseif ($mode == 'word'){
		sort($array,SORT_STRING);
	}

    $string = false;
    foreach ($array as $k => $v){
        $string .= trim($v);
    }
    return $string;
}

$data = fromTxtToArray('input.txt');

$result = false;
foreach ($data['data'] as $k => $v){

	$order_l = lexiOrder($v,'letter');
	$order_w = lexiOrder($v,'word');

	$order_f = array($order_l,$order_w);
	sort($order_f,SORT_STRING);

	$result .= $order_f[0]."\n";
}
echo $result;

BUT!! What about if we find a case with two couples like this:

ji jibw
aa aabb

The correct order would be:

aaaabb jibwji

Since our code now can only test two cases (the shortest before and the shortest after) should we find such a case the algotithm will fail since it will only have:

aaaabb jijibw
aabbaa jibwji

to test, neither of them are correct. We are takling the problem from the wrong perspective. We need to be smart. Let’s analyse the strings.

ji jibw -> jibwji Because the j comes after the b
aa aabb -> aaaabb because the a comes before the b

So..

ji jibw -> jiJ jibw
aa aabb -> aaA aabb

We also know that words are between 1 and 10 chars so we can safely pad them to

ji jibw -> jiJJJJJJJJ jibwJJJJJJ
aa aabb -> aaAAAAAAAA aabb
AAAAAA

aaaabb jijibw
aabbaa jibwji

This way the strings can be sorted with a simple php sort!

function padAndOrder($array){
	foreach ($array as $k => $v) {
        echo $strings[$k] = str_pad($v, 10 + strlen($v), substr($v,0,1));
    }

	sort($strings, SORT_STRING);

	$return = false;
    foreach ($strings as $k => $v) {
        $return .= substr($v, 0, -10);
    }
	return $return;
}

Once again enjoy! I’m doing this to share my method (and the way I’ve come to it) so that it may inspire other, there are maybe thousand of better ways to do this, it’s just my take. Peace.

Leave a Comment more...

Looking for something?

Use the form below to search the site:

Still not finding what you're looking for? Drop a comment on a post or contact us so we can take care of it!

Archives

All entries, chronologically...