Melbourne Perl Mongers/8th March 2006
Leif's Talki (A 'talki' is a wiki-ish talk)
Background
Leif has been working with a lot of basic algorithms recently, and would like to be able to share the joy with everyone else. That's the idea of this talki.
Encoding
Tonight's talki is on encoding. When people think of encoding, they often think about compression. However, encoding just means representing data in a different format. A lot of historical encodings come from trying to squeeze data into 7-bit ASCII so that it can be sent across various transports.
Endianness
When working with low-level algorithms, we need to worry about endianness. Endianness determines the order in which the bytes of each word are stored.
TIFF uses a two-byte magic number ('II' or 'MM') to indicate little or big endianness. The mnemonic for remembering these is that 'II' is used on Intel architectures (little-endian), and 'MM' is used on Motorola architectures (big-endian).
The reason TIFF bothers to record the endianness of the "TIFF-writer" that created the file is that the "TIFF-reader" needs to know the endianness of the values stored in it.
An example: the "T" in "TIFF" is for Tag. A TIFF file has a basic hash of data in it, where each key of the hash is called a tag. Tags are numbers, and the TIFF spec tells us what each number means, e.g. 256 means ImageWidth and 257 means ImageLength (the image height).
According to the TIFF spec, the value stored in the Image Width tag may be a LONG, a 32-bit unsigned integer. That 32-bit integer is stored in the endianness given by the magic number referred to above.
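As a rough sketch of how a reader might act on the magic number, the following picks unpack templates from the first bytes of a file. It relies on the TIFF header layout (bytes 0-1 hold 'II' or 'MM', bytes 2-3 hold the number 42 in the file's own byte order). The function name tiff_byte_order and the sample byte strings are made up for illustration and do not come from the talk.

```perl
use strict;
use warnings;

# Work out the byte order of a TIFF file from its first four bytes.
# Returns the unpack letters to use for 16-bit and 32-bit values.
# (tiff_byte_order is a hypothetical helper, not part of any module.)
sub tiff_byte_order {
    my ($header) = @_;
    my $mark = substr($header, 0, 2);

    my %templates = (
        II => { short => 'v', long => 'V' },    # little-endian
        MM => { short => 'n', long => 'N' },    # big-endian
    );
    my $t = $templates{$mark}
        or die "Not a TIFF byte-order mark: '$mark'\n";

    # Bytes 2-3 hold the number 42, stored in the file's own byte order.
    my $version = unpack($t->{short}, substr($header, 2, 2));
    die "Bad TIFF version number: $version\n" unless $version == 42;

    return $t;
}

# Two hand-built headers (sample data, not read from a real file).
my $little = tiff_byte_order("II\x2A\x00");
my $big    = tiff_byte_order("MM\x00\x2A");

# The same four bytes give different values depending on the byte order.
my $raw = "\x00\x00\x02\x80";
printf "as big-endian:    %d\n", unpack($big->{long},    $raw);    # 640
printf "as little-endian: %d\n", unpack($little->{long}, $raw);    # 2147614720
```

A real reader would go on to follow the offset in bytes 4-7 to the first directory of tags; that part is left out here.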
Let's say our image is 1245391901 pixels wide. Different endianness will result in that value being stored in memory in different ways.
Now 1245391901 is hex 0x4A3B2C1D, a nice hex value because the number parts decrease and the letter parts increase, so we can see the byte and nibble orderings for different endian schemes.
As an example, the value 0x4A3B2C1D would be encoded as:
Little endian: 1D | 2C | 3B | 4A
Big endian: 4A | 3B | 2C | 1D
If we draw a downwards-growing stack, these would be:
Little Endian
base byte -> 1D -> 0001 1101
base + 1 -> 2C -> 0010 1100
base + 2 -> 3B -> 0011 1011
base + 3 -> 4A -> 0100 1010
Big Endian
base byte -> 4A -> 0100 1010
base + 1 -> 3B -> 0011 1011
base + 2 -> 2C -> 0010 1100
base + 3 -> 1D -> 0001 1101
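The two layouts can be reproduced in Perl with pack, where "V" is the little-endian and "N" the big-endian template for a 32-bit unsigned integer. This is only a small illustration; the helper name byte_dump is made up here.

```perl
use strict;
use warnings;

my $width = 0x4A3B2C1D;

# "V" packs a 32-bit unsigned integer little-endian, "N" big-endian.
my $little = pack('V', $width);
my $big    = pack('N', $width);

# Show the bytes in memory order, lowest address first.
# (byte_dump is a hypothetical helper for this example.)
sub byte_dump {
    my ($bytes) = @_;
    return join ' | ', map { sprintf '%02X', $_ } unpack('C*', $bytes);
}

print "Little endian: ", byte_dump($little), "\n";    # 1D | 2C | 3B | 4A
print "Big endian:    ", byte_dump($big),    "\n";    # 4A | 3B | 2C | 1D
```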
The PDP-11 uses a "middle-endian" architecture. Middle-endianness is where the endianness changes for different chunks of information. In the case of the PDP-11, the 16-bit chunks of a 32-bit number are stored big-endian, and the 8-bit chunks in each 16-bit chunk are stored little-endian. A diagram might help.
For the number 1245391901, which has hex encoding 0x4A3B2C1D, we first split this into 16-bit chunks.
Chunk 1
0x4A3B
Chunk 2
0x2C1D
Now each of these chunks is stored as a little-endian number:
Chunk 1
base byte -> 3B -> 0011 1011
base + 1 -> 4A -> 0100 1010
Chunk 2
base byte -> 1D -> 0001 1101
base + 1 -> 2C -> 0010 1100
The final memory layout is
Chunk 1 and 2
base byte -> 3B -> 0011 1011
base + 1 -> 4A -> 0100 1010
base + 2 -> 1D -> 0001 1101
base + 3 -> 2C -> 0010 1100
The initial and final hex layouts are
Initial 0x4A3B2C1D
Finally 0x3B4A1D2C
The initial and final bit layouts are
Initial 0100 1010 0011 1011 0010 1100 0001 1101
Finally 0011 1011 0100 1010 0001 1101 0010 1100
So one way to remember this is that the big chunks are big-endian, and the little chunks are little-endian. Anyone want to posit why they did this?
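The same layout can be sketched in Perl by packing the high and low 16-bit words separately with the little-endian "v" template. The function names pack_pdp11 and unpack_pdp11 are invented for this example; pack has no built-in middle-endian template.

```perl
use strict;
use warnings;

# Pack a 32-bit value PDP-11 style: high 16-bit word first,
# with each word stored little-endian ("v").
# (pack_pdp11 and unpack_pdp11 are hypothetical helper names.)
sub pack_pdp11 {
    my ($n) = @_;
    return pack('v2', ($n >> 16) & 0xFFFF, $n & 0xFFFF);
}

# Reverse the process.
sub unpack_pdp11 {
    my ($bytes) = @_;
    my ($high, $low) = unpack('v2', $bytes);
    return ($high << 16) | $low;
}

my $bytes = pack_pdp11(0x4A3B2C1D);
print uc(unpack('H*', $bytes)), "\n";        # 3B4A1D2C
printf "0x%08X\n", unpack_pdp11($bytes);     # 0x4A3B2C1D
```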
Hex encoding
Hex encoding is often seen in PostScript, which requires files to be 7-bit clean.
Hex encoding is basically taking each byte and writing it as its 2-digit hex equivalent. For example, the bit-string 1001:1101 would encode to '9D'. This doubles the size of any data stream, as each byte of input results in two bytes of output. Because the resulting output stream only uses the characters 0..9 and A..F, the result is guaranteed to be 7-bit clean.
In Perl, hex encoding can be done using:
$hex = uc(unpack("H*",$text));
And decoding can be done as:
$text = pack("H*", $hex);
This will produce some sort of answer even for invalid input, so you might like to check that the hex encoding is valid before decoding it:
$hex =~ /^([0-9A-Fa-f]{2})*$/ or die "whoops!";
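Putting the three snippets together gives a small round trip. This is just one way to wrap them; the names hex_encode and hex_decode are invented for the example, and the check uses \A and \z so that a trailing newline is rejected too.

```perl
use strict;
use warnings;

# Encode every byte as two upper-case hex digits.
sub hex_encode {
    my ($text) = @_;
    return uc(unpack('H*', $text));
}

# Decode, refusing anything that is not an even number of hex digits.
sub hex_decode {
    my ($hex) = @_;
    $hex =~ /\A(?:[0-9A-Fa-f]{2})*\z/
        or die "Invalid hex encoding\n";
    return pack('H*', $hex);
}

my $hex = hex_encode("Hello");
print "$hex\n";                  # 48656C6C6F
print hex_decode($hex), "\n";    # Hello
```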
Ascii 85
Ascii-85 is a way of taking data and transforming it into base-85 numbers, using only 7-bit clean, printable characters. It is sometimes used in PostScript and some encodings on early Macintosh platforms.
Ascii-85 encoding is effectively taking a base-256 stream (each byte has 256 different possible values), and converting it into a base-85 stream. Ascii-85 works on four-byte (32-bit) boundaries.
The conversion is done by treating each group of four bytes as a single 32-bit big-endian number and writing that number as five base-85 digits, most significant first. Each digit (0 to 84) then has 33 ('!') added to it and is converted to ASCII, giving characters from '!' to 'u'. The exception is a group of four zero bytes, which is written as the single character 'z'. Otherwise the result is five printable characters for each four bytes of input.
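A minimal sketch of the encoder in Perl follows. It assumes the input length is a multiple of four bytes; padding of a short final group and any surrounding delimiters are deliberately left out. The function names are made up for the example.

```perl
use strict;
use warnings;

# Encode one four-byte group as Ascii-85.
# (ascii85_group and ascii85_encode are hypothetical names.)
sub ascii85_group {
    my ($group) = @_;
    die "Need exactly four bytes\n" unless length($group) == 4;

    my $n = unpack('N', $group);    # four bytes as one big-endian number
    return 'z' if $n == 0;          # shorthand for four zero bytes

    my @chars;
    for (1 .. 5) {
        unshift @chars, chr($n % 85 + 33);    # digit 0..84 becomes '!'..'u'
        $n = int($n / 85);
    }
    return join '', @chars;
}

# Encode a string whose length is a multiple of four bytes.
sub ascii85_encode {
    my ($data) = @_;
    return join '', map { ascii85_group($_) } unpack('(a4)*', $data);
}

print ascii85_encode("Man "), "\n";        # 9jqo^
print ascii85_encode("\0\0\0\0"), "\n";    # z
```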
Tidyview (Leif Eriksen)
Tidyview was inspired by Damian Conway's book Perl Best Practices. Perl Best Practices makes recommendations on code layout, and suggestions on how perltidy can be used to obtain these.
The Tidyview program allows a user to graphically adjust perltidy switches and see the results of formatting code in a before-and-after format.
Once the user has selected options they desire, they can use Tidyview to emit the options that can be then passed to perltidy to produce the desired results.
Just added: new colourised diffs. Now you can more easily compare your option choices, as Tidyview now shows cvsweb-like colourised diffs for the original/tidied code. Also, if you have Perl::Signature installed, Tidyview will warn if your selected set of options has munged the parse tree from the original code. This should be an incredibly rare event.
Catalyst (Scott Penrose)
What is Catalyst?
* It's an MVC framework.
* It's a very good one.
* Java Struts for Perl
* Ruby on Rails for Perl
A simple way to develop web applications. Not just web, but mostly used for web.
Catalyst separates code into application flow (controller), processing information (model), and outputting results (view).
The Catalyst controller is very web-oriented. Models tend to be more generalised (e.g. Class::DBI). Views are written in more standard templating technologies, such as Template Toolkit or HTML::Mason.
Catalyst is not tied to Apache. It performs dispatch based upon URIs and path information.
Catalyst is definitely not a particular application, such as a portal, shopping cart, or content management system. However, all these applications can be written in Catalyst.
Catalyst provides essential items such as debugging, a stand-alone run-time, Apache plug-ins, and dispatching. It provides a rich set of tools and additions, mostly focused around HTML, AJAX and JavaScript.
Catalyst does not provide a fixed framework. Instead, it's very flexible, allowing any view and any model to be used. The downside of this is that there's no definition of commonly used concepts such as a user.
Installation:
perl -MCPAN -e 'install Task::Catalyst'
Works on Mac, Windows, and *nix.
As a simple example, let's examine an address book: a simple list of names and phone numbers. We want to search, edit, insert, and delete items. We also want multiple outputs for use in various ways.
Our first step is to produce a model. Our model should be developed independently from Catalyst, so it can be used in other applications (command line, other frameworks). Test it separately, and release it onto CPAN so others can use it. Better yet, download one from CPAN that's already been created for you.
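To give a feel for what a framework-independent model might look like, here is a deliberately tiny, in-memory sketch. It is not the AddressBook.pm used in the talk: the package name My::AddressBook and the methods add, remove and search are assumptions made for illustration (only an entry method is implied by the controller code later on).

```perl
use strict;
use warnings;

{
    package My::AddressBook;    # hypothetical name, not the talk's module

    sub new {
        my ($class) = @_;
        return bless { entries => {} }, $class;
    }

    # Insert or replace an entry; returns the object so calls can be chained.
    sub add {
        my ($self, $name, $phone) = @_;
        $self->{entries}{$name} = { name => $name, phone => $phone };
        return $self;
    }

    # Look up one entry by name; returns undef if it is missing.
    sub entry {
        my ($self, $name) = @_;
        return $self->{entries}{$name};
    }

    # Delete an entry and return it (or undef if it was not there).
    sub remove {
        my ($self, $name) = @_;
        return delete $self->{entries}{$name};
    }

    # Sorted names starting with the given prefix, case-insensitively.
    sub search {
        my ($self, $prefix) = @_;
        my @found = sort grep { /^\Q$prefix\E/i } keys %{ $self->{entries} };
        return @found;
    }
}

my $ab = My::AddressBook->new;
$ab->add(leif => '555-0100')->add(scott => '555-0101');
print $ab->entry('leif')->{phone}, "\n";      # 555-0100
print join(', ', $ab->search('s')), "\n";     # scott
```

Because nothing here mentions Catalyst, the same class can be exercised from a command-line script or a test file before it is wrapped in a Catalyst model.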
Once your model has been established, we then use the Catalyst helper scripts to get us started. If our application is called InfoAssistant, then we can begin by invoking:
catalyst.pl InfoAssistant
We can even run our new application (freshly created) by running:
./script/infoassistant_server.pl
This starts the application running on port 3000. We can then use a web browser to view our new application, which contains some nicely formatted boilerplate.
We can flesh out our application by using another helper:
./script/infoassistant_create.pl model AddressBook
This will create an InfoAssistant/Model/AddressBook.pm file for us, along with basic documentation and tests. Our new module will inherit from the Catalyst::Model module.
NOTE: The lib/AddressBook.pm module is only here as a simple example, and not as a good or useful real module.
Have a look here: Melbourne_Perl_Mongers/8th_March_2006/AddressBook.pm for the actual AddressBook.pm module.
A couple of simple scripts are here to show you how to manipulate the simple module above: Melbourne_Perl_Mongers/8th_March_2006/add.pl and Melbourne_Perl_Mongers/8th_March_2006/list.pl
Then have a look here: Melbourne_Perl_Mongers/8th_March_2006/Model_AddressBook.pm for the InfoAssistant/Model/AddressBook.pm
Then we create a controller:
./script/infoassistant_create.pl controller AddressBook
This produces our controller; again with basic documentation and test cases. We then set ourselves a controller method that will be globally accessible:
sub addressbook : Global {
    my ($self, $c) = @_;
    $c->response->body('Hello World');
    # $c->forward($c->view('TT'));
}
Then we can view our new page by going to http://localhost:3000/addressbook . Next we create a view with the same helper:
./script/infoassistant_create.pl view TT TT
Our view is 'TT', and it inherits from Catalyst::View::TT . We can then edit our controller to use this view:
sub addressbook : Global {
    my ($self, $c) = @_;
    $c->forward($c->view('TT'));
}
Re-running our query now results in an error page. We haven't yet indicated which template we wish to use. Let's fix that:
sub addressbook : Global {
    my ($self, $c) = @_;
    $c->stash(template => 'address_list.tt');
    $c->forward($c->view('TT'));
}
So we then create the file root/address_list.tt. Re-running our request now displays this page.
You can get the file here: Melbourne_Perl_Mongers/8th_March_2006/root_address_list.tt
It would suck to have to forward everything to the view every single time, so Catalyst gives us a shortcut that allows us to do that automatically. The lib/InfoAssistant.pm code has an end : Private method that automatically forwards bodyless requests to the desired view.
In fact, there's a Catalyst plug-in called DefaultEnd that handles this automatically. There's talk of enabling this by default in the next version of Catalyst, as it's just so useful.
Catalyst allows PHP to be used as a view.
Now let's get some data and display it. In InfoAssistant/Controller/AddressBook.pm:
# /addressbook/view/xyz
sub view : Local {
    my ($self, $c) = @_;
    my $id = $c->request->arguments->[0]; # xyz
    my $entry = $c->model('AddressBook')->ab->entry($id);
    $c->stash(id => $id);
    $c->stash(entry => $entry);
    $c->stash(template => 'address_view.tt');
}
We also need to create an address_view.tt template which displays our phone-book entry. The template has access to the entire stash, and will use the contents for display.
You can view the file here: Melbourne_Perl_Mongers/8th_March_2006/root_address_view.tt
Now we can head to http://localhost:3000/addressbook/view/leif and get the details for Leif.
Now let's make a way to add entries. Our template includes the following for auto-complete functionality.
[% c.prototype.define_javascript_functions %]
[% c.prototype.auto_complete_function %]
You can get the full add file here: Melbourne_Perl_Mongers/8th_March_2006/root_address_add.tt
And in our controller we define a sub suggest : Local that returns a list of suggestions:
# Find a list of possible results, then...
$c->res->output(
    $c->prototype->auto_complete_result(
        [sort keys %ret]
    )
);
You can view the full controller file with new add & suggest code here: Melbourne_Perl_Mongers/8th_March_2006/Controller_AddressBook.pm
We also need to use the Prototype plugin.
Now our add area has nifty AJAX auto-completion.
Find out more at http://catalyst.perl.org/ and the PerlNet article.

