Category Archives: software & technology

How FLOSS does not work

Pythonists are working on a Stack Overflow alternative that is publicly available on PyPI ([pypi] [website] [github]). At the same time, a company is working on a Stack Overflow alternative privately for a customer.

Please repeat yourself and reinvent the wheel. This is where the software industry is heading nowadays. The gap between company contexts and voluntary projects is too wide to be crossed in a copyright-driven society.

Inkscape PDF Export stopped working

“Save As Copy” with the extension pdf worked very well for almost a year. Suddenly it stopped working: PDF export resulted in “File {FILENAME}.pdf could not be saved”. Very inconvenient for me, since I use SVGs on a regular basis.

The internet is full of bug reports… some people have specific plugin problems, some only get blank PDF pages. Well… this Inkscape 0.47 bug report had the solution for me: remove the preferences.xml file (for me ~/.config/inkscape/preferences.xml) and Inkscape will create a new one on startup.
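A cautious variant of that fix is to move the file aside instead of deleting it, so the old settings can still be inspected later. The following is only a sketch: the function name and the `.bak` suffix are my own choices, and the path has to be passed in because it differs between systems.

```python
from pathlib import Path
from typing import Optional


def reset_inkscape_preferences(prefs: Path) -> Optional[Path]:
    """Move the preferences file aside; return the backup path, or None if it is absent."""
    if not prefs.is_file():
        return None
    backup = prefs.with_name(prefs.name + ".bak")  # hypothetical backup name
    prefs.replace(backup)  # overwrites an older backup with the same name
    return backup


# Usage with the path from the post (run while Inkscape is closed):
# reset_inkscape_preferences(Path.home() / ".config/inkscape/preferences.xml")
```

Inkscape should then write a fresh preferences.xml on its next start, just as it does after a plain delete.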

What is the reason? I think some important package got upgraded and Inkscape is holding some old, conflicting information about this package.

Tags: Inkscape 0.47 pdf export SVG could not be saved preferences cairo

switch/case in assembly

Because of a discussion with my brother I got curious about how switch/case statements are handled internally in assembly. So I compiled some examples and studied the output. I also wrote an article about it and thought an external document would be the better approach:

“switch in amd64 assembly” [HTML]

Twiki and 8 character passwords

At university, TWiki is pretty common software. At least on the second page of Google results (for “twiki” as search term) I can see a TWiki running on our university’s webserver. TWiki is written in Perl, and I will refer to the deprecated 4.1.x version, which was my test system. I got annoyed by its limited password security: passwords are effectively limited to 8 characters.

Login Managers

During installation you will face a select field like this (in the “Security Setup” section):

Twiki Loginmanager in Installation

All those selections refer to different password management backends. Twiki::Client::ApacheLogin is implemented by /twiki/lib/TWiki/Users/ApacheHtPasswdUser.pm and Twiki::Client::TemplateLogin is implemented by /twiki/lib/TWiki/Users/HtpasswdUser.pm. In /twiki/lib/TWiki/Users/Password.pm the interface is defined. You can check out funny source code sequences like this:

---++ ObjectMethod checkPassword( $user, $passwordU ) -> $boolean

Finds if the password is valid for the given login.

Returns 1 on success, undef on failure.

=cut

sub checkPassword {
    return 1;
}

Well… this is our interface. Let’s have a deeper look into the implementation.

Twiki::Client::ApacheLogin

ApacheLogin uses the Apache interface to send 401 HTTP Status codes. If the client receives one of those status codes, a Username and Password Dialog pops up.

Password Dialog for 401 Status Codes

Using this dialog, the login information will be sent to the server. Using a loop in perl, we can print out what the server receives as CGI variables (the ones defined by the server and given to the perl interpreter). I have put the following source code into /twiki/lib/Twiki/Users/HtPasswdUser.pm subroutine new (don’t forget to include Data::Dumper).

use Data::Dumper;

# Dump every CGI environment variable to the Apache error log.
foreach my $key (sort keys %ENV) {
    print STDERR Data::Dumper->Dump([ $ENV{$key} ], [$key]);
}

From the Apache log, we will get the following information.

[...]
HTTP_COOKIE = 'TWIKISID=d00fe404e65832f9d95658d6d9112bec';, referer: /twiki/bin/logon/TWiki/TWikiRegistration
[...]
REDIRECT_REMOTE_USER = 'LukasProkop';, referer: /twiki/bin/logon/TWiki/TWikiRegistration
[...]
REDIRECT_STATUS = '401';, referer: /twiki/bin/logon/TWiki/TWikiRegistration
[...]

Actually I was looking for REMOTE_USER, a CGI variable that is only defined once authorization was done. The cookie is not really interesting, but REDIRECT_STATUS confirms that authentication took place. REDIRECT_REMOTE_USER seems to be the REMOTE_USER I am looking for… in some way. Alright… so what do we have here? Well… username and password are checked automatically by the Apache server, and Perl never receives the password itself. Perl can assume that authentication succeeded and does not deal with it any further. Alright. So we have to determine where the passwords are stored.
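To make the REMOTE_USER / REDIRECT_REMOTE_USER relation a bit more concrete, here is a minimal sketch of how a CGI script could pick up the login name. It is written in Python rather than Perl, the function name is made up, and it is not how TWiki itself does it. It relies on Apache copying the pre-redirect variables with a REDIRECT_ prefix after an internal redirect.

```python
import os
from typing import Mapping, Optional


def authenticated_user(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the login name Apache authenticated, or None without authentication."""
    # REMOTE_USER is the regular CGI variable. After an internal redirect
    # the original value shows up as REDIRECT_REMOTE_USER instead.
    for name in ("REMOTE_USER", "REDIRECT_REMOTE_USER"):
        value = environ.get(name)
        if value:
            return value
    return None


print(authenticated_user({"REDIRECT_REMOTE_USER": "LukasProkop",
                          "REDIRECT_STATUS": "401"}))
```

Note that the password never appears in this environment; only the name of the already authenticated user does.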

Passwords for mod_auth are stored in .htpasswd files. A small UNIX find will return /twiki/data/.htpasswd. This file is updated for each change by the perl script.

LukasProkop:11/Yysc0Op9D2:unixuser@localhost

So the password is stored as a hash associated with the Login name and the local user name. Now let’s come to our real topic: Passwords with more than 8 characters. Let us create some additional accounts.

Username       Password
KarlOrff       1234567
CarminaBurana  123456789
DiesIrae       123456789123456789
SixteAjoutee   123456789123456780
Well… our .htpasswd says:

CarminaBurana:UXjIprwRygc1.:unixuser@localhost
DiesIrae:UtCp6NoUsQdaQ:unixuser@localhost
KarlOrff:7kQC9KJ/39yA.:unixuser@localhost
LukasProkop:11/Yysc0Op9D2:unixuser@localhost
SixteAjoutee:R07ipKyeiYlho:unixuser@localhost

Now let’s log in with various accounts. Since TWiki does not offer a logout button, the most comfortable way is to delete the cookie (see above) and refresh the page. Now we can see our problem: SixteAjoutee and DiesIrae can log in with each other’s passwords. The strange thing is that their hashes are different. Our source code journey goes on…

Violation of second-preimage resistance?

$TWiki::cfg{Htpasswd}{Encoding} = 'crypt';

Our configuration file at /twiki/lib/LocalSite.cfg defines a variable for the various encoding algorithms. Of course the name of such a variable is perfect to search for. The configure script uses this variable, but HtPasswdUser.pm is the only other file that does.

The file encrypting the password is HtPasswdUser.pm at line 134. This file will apply the crypt function with a random salt to the password. The salt is 2 characters in length and stored at the front of the actually stored password. A small test script reveals the truth:

print crypt("123456789123456789", "Ut") eq "UtCp6NoUsQdaQ";
print crypt("123456789123456780", "R0") eq "R07ipKyeiYlho";

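So there we have our problem. crypt uses the DES algorithm from the operating system and is limited to an input of 8 characters. Everything after the eighth character is silently ignored; the two stored hashes only differ because their random salts differ. Any password sharing the first 8 characters therefore matches: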

print crypt("12345678B", "Ut") eq "UtCp6NoUsQdaQ";
print crypt("12345678A", "R0") eq "R07ipKyeiYlho";
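Why the tail of the password does not matter can be sketched without calling crypt at all. The snippet below is not an implementation of crypt; it only models the step in which traditional DES-based crypt turns a password into a 56-bit key, using the low 7 bits of each of the first 8 bytes. The function name is made up for illustration.

```python
def des_crypt_key(password: str) -> bytes:
    """Key material that traditional DES-based crypt() derives from a password."""
    raw = password.encode("utf-8")[:8]      # everything after byte 8 is ignored
    raw = raw.ljust(8, b"\x00")             # shorter passwords are zero-padded
    return bytes(b & 0x7F for b in raw)     # only the low 7 bits of each byte count


pairs = [
    ("123456789123456789", "123456789123456780"),  # DiesIrae vs. SixteAjoutee
    ("12345678B", "12345678A"),
    ("1234567", "123456789"),                       # KarlOrff vs. CarminaBurana
]
for first, second in pairs:
    print(first, second, des_crypt_key(first) == des_crypt_key(second))
```

The first two pairs yield identical key material, the last one does not, which matches the login behaviour observed above.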

The collision-free solution

Of course the algorithm is the problem, and selecting another algorithm like SHA-1 (nope, no MD5!) solves it. This way we neither rely on the operating system nor on missing implementations of other crypto algorithms.

#!/usr/bin/perl -wT

use MIME::Base64 qw( encode_base64 );
use Digest::SHA1 qw( sha1 );

# Build an htpasswd-style "{SHA}" entry: base64 of the raw SHA-1 digest.
sub get
{
    my( $passwd ) = @_;

    my $encodedPassword = '{SHA}' . encode_base64( sha1( $passwd ) );
    $encodedPassword =~ s/\s+$//;    # encode_base64 appends a newline
    return $encodedPassword;
}

print get("1234568B"), "\n";
print get("1234568A"), "\n";

This program returns two different hashes:

{SHA}sgDumzcRNpPJL8tCgM18JIR1ayc=
{SHA}RsUxZFkQgYAeTdsPmIixTYEdFgg=
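For comparison, the same encoding can be sketched in Python with the standard library only. The function name is my own; the format (the literal prefix {SHA} followed by the base64 of the raw SHA-1 digest) is the one the Perl script above produces.

```python
import base64
import hashlib


def htpasswd_sha(password: str) -> str:
    """Return an htpasswd-style '{SHA}' entry for the given password."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()  # 20 raw bytes
    return "{SHA}" + base64.b64encode(digest).decode("ascii")


# Unlike crypt, the whole input is hashed, so a change at any position matters.
print(htpasswd_sha("123456789123456789"))
print(htpasswd_sha("123456789123456780"))
```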

Migration

How can thousands of user accounts be migrated to another algorithm? Since only a one-way hash of each password is stored, re-hashing the real password with another algorithm is practically impossible. I have written a small crypt() cracking program in Python (sorry, Perl ;-) ), but of course it is way too slow, even for a single password. So the only solution is to reset the passwords of all users. First, call the /twiki/bin/configure script and change the algorithm setting (“{Htpasswd}{Encoding}” in the “Security Setup” section) [0]. Second, BulkResetPassword will help you reset the passwords for all users. It takes some effort and time, but in the end you will gain a higher level of security :-)

[0] It is also possible to directly modify the $TWiki::cfg{Htpasswd}{Encoding} line in /twiki/lib/LocalSite.cfg
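During such a migration it helps to know which accounts still carry an old crypt hash. A possible helper, assuming the login:hash:user line layout shown earlier and the {SHA} prefix of the new entries (the function name is hypothetical and not part of TWiki):

```python
from typing import Iterable, List


def logins_needing_reset(htpasswd_lines: Iterable[str]) -> List[str]:
    """Return the logins whose stored hash is not yet a '{SHA}' entry."""
    pending = []
    for line in htpasswd_lines:
        line = line.strip()
        if not line:
            continue
        login, password_hash = line.split(":")[:2]
        if not password_hash.startswith("{SHA}"):
            pending.append(login)
    return pending


sample = [
    "DiesIrae:UtCp6NoUsQdaQ:unixuser@localhost",
    "KarlOrff:{SHA}sgDumzcRNpPJL8tCgM18JIR1ayc=:unixuser@localhost",
]
print(logins_needing_reset(sample))
```

In practice the lines would come from /twiki/data/.htpasswd; the second sample line is invented to show a migrated account.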

HowTo: Bugfix file too large for wordpress importer

During the update to WordPress 3.2, I ran (like many other people) into the file size limit for uploads. By default PHP sets it to 2MB, whereas my WordPress export backup file (XML) already has 4MB. No, reconfiguring PHP’s upload_max_filesize was not an option for me, since the backup was already stored as a dump on the server. So the only thing left to do was to replace the uploaded file with the one already on the hard disk and make WordPress recognize it. And this is not that difficult if you know where to look. So here is my patch to make the wordpress-importer import a file already stored on the server’s hard disk. The WordPress export file to load has to be stored at wp-content/uploads/wordpress.import.xml.

diff --git a/wp-content/plugins/wordpress-importer/wordpress-importer.php b/wp-content/plugins/wordpress-importer/wordpress-importer.php
index 5e38484..e0cace0 100644
--- a/wp-content/plugins/wordpress-importer/wordpress-importer.php
+++ b/wp-content/plugins/wordpress-importer/wordpress-importer.php
@@ -102,6 +102,7 @@ class WP_Import extends WP_Importer {
         * @param string $file Path to the WXR file for importing
         */
        function import( $file ) {
+               $file = ABSPATH . 'wp-content/uploads/wordpress.import.xml'; #wp_import_handle_upload();
                add_filter( 'import_post_meta_key', array( $this, 'is_valid_meta_key' ) );
                add_filter( 'http_request_timeout', array( &$this, 'bump_request_timeout' ) );

@@ -132,7 +133,7 @@ class WP_Import extends WP_Importer {
        function import_start( $file ) {
                if ( ! is_file($file) ) {
                        echo '<p><strong>' . __( 'Sorry, there has been an error.', 'wordpress-importer' ) . '</strong><br />';
-                       echo __( 'The file does not exist, please try again.', 'wordpress-importer' ) . '</p>';
+                       echo __( 'The file '.htmlspecialchars($file).' does not exist, please try again.', 'wordpress-importer' ) . '</p>';
                        $this->footer();
                        die();
                }
@@ -188,7 +189,7 @@ class WP_Import extends WP_Importer {
         * @return bool False if error uploading or invalid file, true otherwise
         */
        function handle_upload() {
-               $file = wp_import_handle_upload();
+               $file = array('id' => 4869, 'file' => ABSPATH . 'wp-content/uploads/wordpress.import.xml'); #wp_import_handle_upload();

                if ( isset( $file['error'] ) ) {
                        echo '<p><strong>' . __( 'Sorry, there has been an error.', 'wordpress-importer' ) . '</strong><br />';

I did not put any further (compatibility) research into that issue. Worked for me™ with WordPress Version 3.2 and PHP Version 5.3.

WordPress FAQ: Import and Export

Python wants to become one-based?

Yesterday (or today in my timezone) Guido wrote a reply which indicated that he wants Python to become one-based. I recently wrote about my WTF moment when I realized that SQL is one-based (as is Lua, which comes up in the Python mailing list discussion). Personally speaking, I think Python is that damn intuitive because it is based on mathematical principles (which Guido, as a mathematician, is aware of). But math is not so consistent and beautiful that it solves all problems.

A sequence of numbers is a range from x to y with x included but y excluded. To get all indices of a list, you can use range(0, len(seq)). This is damn readable and does not include a nasty -1 like in most other languages. Well… life is not that beautiful. If we change to a one-based numbering system, what will happen? range(1, len(seq)+1)? Seriously? Lua does not exclude y and therefore does not have a nasty +1.
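A small sketch of that comparison (the list content is arbitrary): with half-open ranges the zero-based version needs no correction term, and two adjacent ranges join without losing or duplicating an index.

```python
seq = ["a", "b", "c"]

zero_based = list(range(0, len(seq)))      # indices as Python uses them
one_based = list(range(1, len(seq) + 1))   # what a one-based Python would need

# Half-open ranges split cleanly at any point in the middle.
middle = 1
assert list(range(0, middle)) + list(range(middle, len(seq))) == zero_based

print(zero_based, one_based)
```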

Lists start with 0, because 0 is the zeroth number. In a decimal system, the zeroth number with two digits is 10. Three digits: 100. The least-significant digit is always zero. The zeroth number with one digit is 1? This only comes from an exception in mathematics (10^(2..0), i.e. 100, 10, 1).

If I want to have all numbers in a list at 2*n indices in Python I will use …

>>> a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
>>> [a[2 * index] for index in xrange(0, len(a) / 2)]
[1, 3, 5, 7, 9]

Again: No nasty +1. Just like Wikipedia points out, it’s about congruence.

I understand that in certain situations it can be useful. If you want to get all numbers at 2^n indices, you probably want to have the mathematical base^0 = 1 and you would be fine with one-based systems. You probably want to get the real zeroth number:

>>> import math
>>> a = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
>>> [a[2 ** index - 1] for index in xrange(0, int(math.log(len(a), 2)) + 1)]
[0, 1, 2, 13]

For Lua it just says:

a = {0, 1, 1, 2, 3, 5, 8, 13, 21, 34}
-- one-based: a[2^0] = a[1] is already the first element, no -1 needed
for index=0,math.log(#a, 2) do
  print(a[2 ^ index])
end

I have shown only mathematical examples, but science is what I consider to be one of the most important fields for Python. Other (especially old-school) languages really suck at this, and it is one of Python’s strengths to work well for scientists from other fields. In a mathematical context one-based systems might be okay (see MATLAB), but for me a list starting with 1 is unpythonic and not useful. Yes, just like the Lua community I can live with one-based languages, but I generally consider it to be a design fault.

Thanks to Bruce Leban for this awesome funny reply.

Update: Guido, you got me. It was a joke ;-)

When MySQL substr does not work

So let’s start with a simple MySQL setup:

mysql> CREATE TABLE example (prefix VARCHAR(50),
domain VARCHAR(30), PRIMARY KEY(prefix, domain));
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO example VALUES
("hello", "world.org"), ("foo", "bar"),  ("Foot","ball");
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

Alright… so we have some sort of “split up” email addresses. It’s not a problem to combine them on the fly, since MySQL provides basic string operations.

mysql> SELECT CONCAT(prefix, "@", domain) as email_addr FROM example;
+-----------------+
| email_addr      |
+-----------------+
| foo@bar         |
| Foot@ball       |
| hello@world.org |
+-----------------+
3 rows in set (0.00 sec)

So now let’s say, we want to remove the Top Level Domain from the domain (“world” instead of “world.org”) [1].

mysql> SELECT SUBSTR(domain, 0, LOCATE('.', domain)-1) as tld FROM example;
+-----+
| tld |
+-----+
|     |
|     |
|     |
+-----+
3 rows in set (0.00 sec)

What? Let’s slow down… take a substring of domain starting at position zero and with length of the position of the “.” (dot character) minus 1 (before that dot character). Okay… so something has to be wrong about it.

mysql> SELECT domain as tld FROM example;
+-----------+
| tld       |
+-----------+
| bar       |
| ball      |
| world.org |
+-----------+
3 rows in set (0.00 sec)

A quarter of an hour later I realized the problem with the help of a colleague:

mysql> SELECT SUBSTR(domain, 1, LOCATE('.', domain)-1) as tld FROM example;
+-------+
| tld   |
+-------+
|       |
|       |
| world |
+-------+
3 rows in set (0.00 sec)

SQL (and therefore the SUBSTR function) is one-based. So to address the first character you have to specify position “1”. LOCATE, in turn, returns 0 to signal “no match”, which is why the rows without a dot stay empty.
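The behaviour of both queries can be replayed with a small Python model. This is only a sketch of the semantics described here, not MySQL’s implementation, and the function names are made up; it covers the one-based start position, the empty result for position 0, and 0 as the “no match” value.

```python
def mysql_locate(needle: str, haystack: str) -> int:
    """One-based position of needle in haystack, 0 when it does not occur."""
    return haystack.find(needle) + 1  # str.find returns -1 for "no match"


def mysql_substr(text: str, pos: int, length: int) -> str:
    """Substring with a one-based start position, modelled on SUBSTR(text, pos, length)."""
    if pos == 0 or length < 1:
        return ""
    start = pos - 1 if pos > 0 else len(text) + pos  # negative pos counts from the end
    if start < 0:
        return ""
    return text[start:start + length]


for domain in ("bar", "ball", "world.org"):
    length = mysql_locate(".", domain) - 1
    print(repr(mysql_substr(domain, 0, length)), repr(mysql_substr(domain, 1, length)))
```

Only world.org with start position 1 produces a non-empty result, as in the second query above.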

Stupid world. Why do we have conventions? To break them? I was aware that MATLAB sucks in this regard too, but I was shocked, when I heard about Lua (both languages one-based).

Note [1]: If you really have such a use case, please refer to SUBSTRING_INDEX.

The XOR issue

In discrete mathematics, you often define chains of logical operations. XOR has an issue that not everybody is aware of: if it is used with 3 variables, it loses its most important behaviour, namely indicating that exactly one value is set to True.

A      B      C      A ⊕ B ⊕ C
True   True   True   True
True   True   False  False
True   False  True   False
True   False  False  True
False  True   True   False
False  True   False  True
False  False  True   True
False  False  False  False

At least XOR stays associative for 3 variables: (A ⊕ B) ⊕ C = A ⊕ (B ⊕ C). For any greater number of variables the behaviour no longer has an intuitive relation to the 2-variable XOR: the chain is True exactly when an odd number of inputs is True.

A      B      C      D      A ⊕ B ⊕ C ⊕ D
True   True   True   True   False
True   True   True   False  True
True   True   False  True   True
True   True   False  False  False
True   False  True   True   True
True   False  True   False  False
True   False  False  True   False
True   False  False  False  True
False  True   True   True   True
False  True   True   False  False
False  True   False  True   False
False  True   False  False  True
False  False  True   True   False
False  False  True   False  True
False  False  False  True   True
False  False  False  False  False

If you want to experiment with XOR, I refer you to my article “Truthtable with python”. In Python, XOR is written as a caret “^”.
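As a starting point for such experiments, here is a short sketch (not the code from the linked article; the helper names are mine) that compares a chained XOR with a real “exactly one is True” test and lists the rows where the two disagree.

```python
from functools import reduce
from itertools import product
from operator import xor


def chained_xor(values):
    """A ^ B ^ C ^ ... for any number of booleans."""
    return reduce(xor, values, False)


def exactly_one(values):
    """True only if a single value is True."""
    return sum(values) == 1


for n in (2, 3, 4):
    rows = list(product((True, False), repeat=n))
    # A chained XOR is a parity check: True for an odd number of True inputs.
    assert all(chained_xor(row) == (sum(row) % 2 == 1) for row in rows)
    differing = [row for row in rows if chained_xor(row) != exactly_one(row)]
    print(n, differing)
```

For two variables the list of differing rows is empty; from three variables on, every row with three True values shows up.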

Advertisements not for everybody

Advertisement showing source code instead of content

Probably such advertisements are not attractive for everybody :-/

HowTo subscribe to the (non-existent) twitter user RSS feed

I will explain it for Mozilla Thunderbird since this is my current feed reader. Just leave out steps 3, 4, 6 and 7 for any other feed reader.

  1. Get the name of the twitter user.
  2. Get the ID of the twitter user by name.
  3. Open Thunderbird, select the top-level element containing your RSS feeds.
  4. Select “Manage subscriptions” and click “Add”.
  5. The feed URL is http://twitter.com/statuses/user_timeline/<userID>.rss (with the placeholder replaced)
  6. Click “Ok” to subscribe and move the RSS-Feed to any subfolder you would like to place it.
  7. Prefer “View” > “Feed Message Body As” > “Plain Text” and “Summary”
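Step 5 can also be scripted when several users are to be subscribed at once. A tiny sketch; the function name and the example ID are placeholders, and only the URL pattern comes from the list above:

```python
def twitter_feed_url(user_id: int) -> str:
    """Build the user timeline feed URL for a numeric Twitter user ID."""
    return f"http://twitter.com/statuses/user_timeline/{user_id}.rss"


# 12345 is a made-up ID; use the one looked up in step 2.
print(twitter_feed_url(12345))
```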

Why “non-existent”? Because Twitter dropped support for user RSS feeds some time ago. The feeds still exist, but should not be used. However, I like RSS and have been using this feature continuously for the last 2 years. This decision annoys me :-(