Data Encoding Methods

getrithekd · January 3, 2024, 6:25am

This guide and the comments will be about different ways to encode data into numbers. I have 3 ways: digit, factor, and DNA.

Digit

This is the standard encoding everyone uses. You write the data as a number in base n, and each digit of that number stores one piece of the data.
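Here is a minimal Python sketch of digit encoding, just to make the idea concrete. The function names `encode_digits` and `decode_digits` are made up for this example, and it assumes every value is a whole number from 0 up to (but not including) the base.

```python
def encode_digits(values, base):
    """Pack a list of digits (each 0 <= v < base) into one integer."""
    number = 0
    for v in values:
        if not 0 <= v < base:
            raise ValueError(f"{v} is not a valid digit in base {base}")
        # Shift everything stored so far one place left, then add the new digit.
        number = number * base + v
    return number


def decode_digits(number, base, count):
    """Unpack `count` digits from `number`, most significant digit first."""
    values = []
    for _ in range(count):
        # The remainder is the last digit; the quotient is everything before it.
        number, digit = divmod(number, base)
        values.append(digit)
    return values[::-1]


packed = encode_digits([3, 1, 4], 10)
print(packed)                        # 314
print(decode_digits(packed, 10, 3))  # [3, 1, 4]
```

The decoder needs to know how many digits to expect, otherwise leading zeros would be lost.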

Factor

This is the first step into oblivion. You encode the data into the exponents of the prime factorization of a number. Idk how to use it, and that’s part of what this post is about.
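As a rough sketch of what factor encoding could look like, here is a Python version. It assumes the first value goes into the exponent of 2, the second into the exponent of 3, the third into the exponent of 5, and so on. The names `first_primes`, `encode_factor`, and `decode_factor` are invented for this example.

```python
def first_primes(n):
    """Return the first n primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # A candidate is prime if no smaller prime divides it evenly.
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def encode_factor(values):
    """Store each non-negative integer as the exponent of its own prime."""
    number = 1
    for prime, value in zip(first_primes(len(values)), values):
        if value < 0:
            raise ValueError("values must be non-negative")
        number *= prime ** value
    return number


def decode_factor(number, count):
    """Recover `count` exponents by dividing out each prime in turn."""
    if number < 1:
        raise ValueError("number must be a positive integer")
    values = []
    for prime in first_primes(count):
        exponent = 0
        while number % prime == 0:
            number //= prime
            exponent += 1
        values.append(exponent)
    return values


packed = encode_factor([3, 1, 2])  # 2**3 * 3**1 * 5**2
print(packed)                      # 600
print(decode_factor(packed, 3))    # [3, 1, 2]
```

Getting the exponents back out means dividing repeatedly by each prime, which is the expensive part.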

DNA

This is pretty crazy and hurts my mind a bit. So in DNA, you have 4 different things you can use to build the molecule: the bases A, C, G, and T. You also have 2 strands that click together. A and T click together, and C and G click together. The strands are built out of the bases. Think of one strand as a number, and each base as a digit. So far, this is just plain base 4 encoding, until you think of all the unused possibilities. DNA codes for about 20 different possibilities with those 4 digits. Two digits only give 4 × 4 = 16 combinations, so it takes 3 digits (4 × 4 × 4 = 64 combinations) to encode one possibility. But that leaves the other 44 combinations unused. So that’s what the other strand takes advantage of. It has to have the opposite digits, but we can code the other 44 combinations into whatever we want, effectively getting double the data from one property.
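To make the numbers concrete, here is a small Python sketch that reads a strand as a base 4 number and builds the matching second strand. The digit assignment A=0, C=1, G=2, T=3 is an assumption made for this example, and the names `strand_to_number` and `complement` are invented. It only shows the base 4 reading and the pairing rule, not a full scheme for using the second strand.

```python
BASES = "ACGT"  # assumed digit values: A=0, C=1, G=2, T=3
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}  # A pairs with T, C with G


def strand_to_number(strand):
    """Read a strand as a base 4 number, one base per digit."""
    number = 0
    for base in strand:
        number = number * 4 + BASES.index(base)
    return number


def complement(strand):
    """Build the second strand; it is computed from the first one."""
    return "".join(COMPLEMENT[base] for base in strand)


combinations = 4 ** 3       # 3 digits in base 4
unused = combinations - 20  # left over after about 20 possibilities

print(combinations, unused)     # 64 44
print(strand_to_number("GAT"))  # 2*16 + 0*4 + 3 = 35
print(complement("GAT"))        # CTA
```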

Using It

So now, how can we use these in GKC to make stuff like text displays and whatnot? Which one is best for GKC? Please comment on this.

vqnillaxx · January 3, 2024, 6:44am

woah

Grey_Stone · January 3, 2024, 11:03am

I think that factor encoding/decoding would have the most balance between memory efficiency and security. However, depending on your exponent, you might lose some data in the process.

CassiusDoomlorde · January 3, 2024, 12:46pm

ugh wow too hard

I’m just going to take a nap

ClicClac · January 3, 2024, 12:57pm

Yeah, properties can’t handle the factoring method. Those numbers get big fast, and we would probably run out of room rather quickly.
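To get a feel for how fast the numbers grow, here is a rough Python comparison of the same five values stored both ways. The data (the letters of "hello" numbered with a=0, b=1, and so on) and the function names are assumptions for this example. It says nothing about the actual size limit of a property, only about how long each number is.

```python
PRIMES = [2, 3, 5, 7, 11]  # enough primes for five values


def factor_encode(values):
    """Each value becomes the exponent of the next prime."""
    if len(values) > len(PRIMES):
        raise ValueError("not enough primes listed for this many values")
    number = 1
    for prime, value in zip(PRIMES, values):
        number *= prime ** value
    return number


def digit_encode(values, base):
    """Each value becomes one digit of a base-`base` number."""
    number = 0
    for value in values:
        number = number * base + value
    return number


data = [7, 4, 11, 11, 14]  # "hello" with a=0, b=1, ... (assumed mapping)
factor_number = factor_encode(data)
digit_number = digit_encode(data, 26)

print(len(str(digit_number)), "decimal digits with digit encoding")
print(len(str(factor_number)), "decimal digits with factor encoding")
```

For these five values the digit version needs 7 decimal digits and the factor version needs 36.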

wingwave · January 3, 2024, 1:22pm

The problem is that Gimkit Creative is so simple and what we’re thinking of is so complex. We need to find a solution that’s simple enough for us to build in Creative.

getrithekd · January 3, 2024, 1:24pm

So I’m thinking that factor encoding is out bc of just getting the exponents in the first place. DNA encoding is just base encoding 2.0, but harder to use.

wingwave · January 3, 2024, 1:25pm

here comes shdwy…

Shdwy · January 3, 2024, 1:26pm

Of course this guy goes straight to DNA

rithek what is it with you and chem bro

DNA feels… extra.
We could probably use the most efficient method of storing binary data, which is (can you guess it?) binary encoding.
That and digits feel the easiest. They still take a bucketload of blocks to unpack, but that’s just gonna happen.

wingwave · January 3, 2024, 1:28pm

What type of “data” are you thinking about?

getrithekd · January 3, 2024, 1:30pm

It’s where I got the idea from.

getrithekd · January 3, 2024, 1:30pm

Idk. Like any sort of large source to be compiled.

Grey_Stone · January 3, 2024, 1:59pm

You could probably also use binary to encode data.

CassiusDoomlorde · January 3, 2024, 1:59pm

That’s digit-type, in base 2.

TorontoBulls1 · January 3, 2024, 2:01pm

I think I have used factoring before, but that might not be the right term. It’s pretty useful.

I used it when I was making a drop-items-on-death system (not finished). It used concatenated properties that tracked how many of each item you have through IIMS. Then I would eventually need to store the item number (property name) and the item amount (value of property) as one variable. I took the item number * 1000 (because players can’t get 1000 of my items) and then added the item amount.

E.g. property 5 with a value of 39 would give 5,039, which could then be deciphered.
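The packing described in this post can be sketched in Python like this. The names `SLOT`, `pack_item`, and `unpack_item` are made up for the example, and it assumes the amount always stays below 1000, as the post says.

```python
SLOT = 1000  # assumes no player ever holds 1000 or more of one item


def pack_item(item_number, amount):
    """Combine an item number and an amount into one number."""
    if not 0 <= amount < SLOT:
        raise ValueError("amount must be between 0 and 999")
    return item_number * SLOT + amount


def unpack_item(packed):
    """Split the packed number back into (item_number, amount)."""
    return divmod(packed, SLOT)


packed = pack_item(5, 39)
print(packed)               # 5039
print(unpack_item(packed))  # (5, 39)
```

Dividing by 1000 gives the item number back, and the remainder is the amount.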

Is this factoring or something different?

Tell me if I need to elaborate more.

getrithekd · January 3, 2024, 2:13pm

That’s digit encoding.

TorontoBulls1 · January 3, 2024, 2:27pm

Ok

Gimkitsuggestor · January 3, 2024, 8:06pm

man gimkit nowadays is becoming its own scientific website, technically we’re professional coders at their finest

Blackhole927 · April 12, 2024, 1:27pm

I just had an epic thought: base 64000 encoding!

With the addition of text objects, this should be possible
Basically, we assign a set of random alphanumeric characters (plus spaces and punctuation) to a single unicode character, and we do that for about 64000 characters.

This allows us to assign a unique set of 3 alphanumeric characters (plus spaces and punctuation) to each of these unicode characters. For example, the unicode character 你 might end up representing “ah3”, and ⦓ might end up representing “fds”.

Right now, using no compression, it’s possible to store about 7,168,000 Unicode characters within Gimkit, at 100% memory. Applying this 3x compression algorithm, we sacrifice the ability to use other languages but get the ability to store 21,504,000 characters!

How big is this? Well, if we assume that the average word is 5 letters long, we can store the entirety of the Harry Potter book series in Gimkit, THREE TIMES OVER! (Please, for legal reasons, do not do this. Violating copyright is bad.)

So… yeah. Base 64000 encoding

DXCTYPE · April 12, 2024, 1:34pm

oh yay another research topic that I needed and I didn’t know existed.