Rewriting Genpass in Rust

Sep 21, 2024

So recently, I decided to take a shot at creating my first ever project in Rust.

Rust is a high-level language with low-level performance. Its main selling point to most developers is that it is a type-safe language with a ton of memory safety features, while still offering the "blazingly fast" speeds of C. As a developer who has written projects in C, I decided that I wanted to learn this language and, to be honest, fell in love with Rust.

Development made clean, fast, and easy

It all started a few months ago. I had this idea of building GenPass in Rust to take advantage of Rust's CUDA libraries and use the GPU for password generation. That approach would exclude AMD GPUs, and I have yet to implement any CUDA functionality. I did, however, add a ton of new features that Rust made really simple to implement.

For example, in the original GenPass, you'll notice that all of the code is written into one file. This is because I simply don't have the time, or more really, the motivation to implement a whole library system that would add extra steps to compiling the binary. make is a great tool and it definitely has its use cases when it comes to compiling C code, but I would have to manually update the makefile whenever I made any additions. I could technically auto-generate the makefile, but that's still more keystrokes and time spent to make even a test binary.

Rust's Cargo package manager makes it so much easier not only to find and use external libraries, but also to add these dependencies to my project automatically. The type-safe system, paired with the extremely powerful Rust compiler, made development much easier and helped me understand and learn the language as I went on.

Getting Started Refactoring + Rewriting

So the first thing I wanted to test myself on was to see if I could just generate strings of random characters of a user-defined length. The interesting thing about strings in Rust is that by default, Rust uses UTF-8. This brought about a string of headaches when I was building support for special character generation because:

  • I would have to account for the byte size of characters outside of ASCII.

  • Instead of simply counting the characters, I would now have to account for the size of each character in bytes.

This had a huge impact on development because I now had to come up with an algorithm to ensure that the correct number of characters would be procedurally generated.

Additionally, since Rust's random library comes with a fully capable Cryptographically Secure Pseudorandom Number Generator (CSPRNG), I could also remove the original Gordian Knot Algorithm, which relied on plenty of cheap tactics like process ID factoring to help seed the random number generator.

The headaches of UTF-8

As an example of how annoying this part of the project was, I am going to introduce you to two characters: 'Ó' and 'O'.

Now normally, both of these would be regarded as one character. However, Rust's .len() method reports 'Ó' as 2 because of how accents and special characters are encoded under UTF-8: 'Ó' takes 2 bytes while 'O' takes a single byte. .len() returns the size in bytes, not the number of characters; .chars().count() is the method that counts characters, and it returns 1 for a precomposed 'Ó'. This brought about a huge headache because initially I was simply creating a vector of chars and pushing each generated char into it, which resulted in generated strings coming out shorter than what the user had specified. Now I had to find a new way to account for both the byte size and the total length of the string.
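For reference, here is a small sketch of what the standard library reports for these two characters. The helper name byte_and_char_len is made up for this post and is not part of GenPass. The accented letter is written as an escape so that the result does not depend on how the source file was saved, and the last line shows the decomposed form (a plain 'O' followed by a combining accent), which is one way a single visible letter can end up counting as two chars.

```rust
/// Returns (length in bytes, number of chars) for a string slice.
/// Hypothetical helper for illustration only.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // Precomposed 'Ó' (U+00D3): two bytes in UTF-8, one char.
    println!("{:?}", byte_and_char_len("\u{00D3}")); // (2, 1)

    // Plain ASCII 'O': one byte, one char.
    println!("{:?}", byte_and_char_len("O")); // (1, 1)

    // Decomposed form: 'O' plus combining acute accent (U+0301).
    println!("{:?}", byte_and_char_len("O\u{0301}")); // (3, 2)
}
```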

Finding a solution

Initially, I tried a plethora of different libraries to solve my issue, and to my misfortune, none of them really helped. So I did what I do best: I sat down for a moment to come up with a solution myself. After a few weeks of putting the project off, I finally came back to it with an epiphany: keep a separate counter for the current size of the string in bytes while the characters are being generated, and let the target byte size grow by one for every 2-byte character. If generation stops once the current size in bytes equals the target byte size, and the target never exceeds its maximum potential byte size, I should be able to generate the correct number of characters each and every time. So the final result was this block of code here:

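// stringgenerator.rs

let mut bytesize: i16 = 0; // bytes generated so far
let mut target_bytesize: i16 = {length_of_string}; // starts at the requested character count
let max_size: i16 = target_bytesize * 2; // since one char can be 2 bytes.

// Every 2-byte char raises the target by one, so the loop ends once
// exactly {length_of_string} characters have been generated.
while bytesize != target_bytesize {
  // *Insert character generation alg here.*

  // x is assumed to be the code point of the char that was just generated;
  // anything above 127 takes 2 bytes in UTF-8 (within the extended range used here).
  if x > 127 && target_bytesize < max_size {
    target_bytesize += 1;
    bytesize += 2;
  } else {
    bytesize += 1;
  }
}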
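As a possible alternative to tracking two byte counters by hand, the standard library's char::len_utf8 reports how many bytes a given char needs, so a loop can count characters directly and add up the bytes on the side. The sketch below is only an illustration under a few assumptions: POOL, Lcg and generate are hypothetical names rather than GenPass code, the pool is a tiny made-up sample, and the little generator is there purely so the example runs without external crates. It is not cryptographically secure and should not be used for real passwords.

```rust
/// Hypothetical character pool: a few ASCII symbols plus some 2-byte letters
/// (written as escapes: Ó, é, ñ, ü).
const POOL: &[char] = &['a', 'B', '7', '#', '\u{00D3}', '\u{00E9}', '\u{00F1}', '\u{00FC}'];

/// Tiny linear congruential generator, used only to keep this sketch
/// dependency-free. NOT cryptographically secure.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random index in 0..bound (bound must be non-zero).
    fn next_index(&mut self, bound: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % bound
    }
}

/// Builds a string of exactly `length` chars and returns it with its size in bytes.
fn generate(length: usize, rng: &mut Lcg) -> (String, usize) {
    let mut out = String::with_capacity(length * 2);
    let mut bytesize = 0;
    for _ in 0..length {
        let c = POOL[rng.next_index(POOL.len())];
        bytesize += c.len_utf8(); // 1 for ASCII, 2 for the accented letters above
        out.push(c);
    }
    (out, bytesize)
}

fn main() {
    let mut rng = Lcg(42);
    let (password, bytesize) = generate(20, &mut rng);
    // The char count matches the request; the byte size depends on the chars drawn.
    assert_eq!(password.chars().count(), 20);
    assert_eq!(password.len(), bytesize);
    println!("{} ({} bytes)", password, bytesize);
}
```

Counting iterations instead of bytes means the loop cannot overshoot, and the byte total is still available if something downstream needs it.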

After about an hour of testing, running this through 100,000 different iterations, the character generating algorithm worked! The strings generated were now the correct size, and GenPass could now create passwords with extended ASCII.

Performance

Now to put Rust to the test: is it really up to the task of being as fast as C? I ran a quick timing test, pushing both versions through 1,000 iterations of generating ASCII passwords of length 20. The C version of GenPass used 4-5% less CPU, with runtimes fluctuating by about ±40ms.

An example of one of the tests.

Well, what about password generation performance? The Rust version of GenPass was actually very underwhelming here, seeing an average increase in bit entropy of less than 1%. However, the median increased to 121.016 from 119.835.

Some interesting notes from the data sets I noticed:

  • 131.397 seems to be the theoretical cap for ASCII passwords of length 20, as this number was the maximum for both datasets.

  • The C version of GenPass had a higher minimum entropy of 88.236, as compared to 86.996 from the Rust variation.

  • As time goes on, I may check how frequently bit entropies greater than 124 occur in both programs.
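The 131.397 figure lines up with the usual upper bound for password entropy, which is the length multiplied by log2 of the pool size. Here is a quick sketch of that arithmetic, assuming a pool of the 95 printable ASCII characters (codes 32 through 126). That pool size is my assumption for the illustration, and max_entropy_bits is a made-up helper name, not something from either version of GenPass.

```rust
/// Upper bound on entropy, in bits, for a password of `length` characters
/// drawn uniformly and independently from a pool of `pool_size` symbols.
fn max_entropy_bits(pool_size: u32, length: u32) -> f64 {
    f64::from(length) * f64::from(pool_size).log2()
}

fn main() {
    // Assumption: 95 printable ASCII characters, password length 20.
    println!("{:.3}", max_entropy_bits(95, 20)); // prints 131.397
}
```

This only reproduces the cap; how the per-password entropy values in the datasets were scored is not covered here.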

Conclusions

As Rust continues to develop, I may shift all of my learning time into the language, as it seems to be the future of developing secure applications. Many government agencies across the world are shifting over to Rust, with TRACTOR (Translating All C to Rust) being pushed by DARPA. It seems that Rust is going to be in high demand, as memory management issues are the source of many problems in modern computing and lead to many security vulnerabilities. As the ecosystem of development changes, so does the realm of security surrounding these applications. I will keep you guys updated on my Rust learning journey as I continue creating projects in this awesome language. As always, thank you for reading, and check out the program details and source code here.