C: Processing Windows Bitmaps Or DIBs With The Propeller?


  • idbruce Posts: 5,537
    edited April 9
    Jason

    Thanks... That will help me immensely.

    Earlier I mentioned....
    The more I think about the new proposed route, the more I realize how complicated it will be. I had forgotten about the string reversals, which means that the pixel count is also being reversed and is currently not accurate when alternating scan lines or when aligning images to the right. The program will need some serious reorganization to make this work.

    Additionally, the above pseudo-code would not be useful for plot files with alternating scan lines.

    EDIT: Okay, I was overthinking this problem. Instead of reorganizing the code, all I really have to do is create my own string reversal function, which reverses the commands, but not the values associated with them.

    Unless I am mistaken, which I could be, I believe an alternated scan line or right aligned image should be reversed to look like so....

    Before Reversal: M12345L67890M9876L54321N
    After reversal: L54321M9876L67890M12345N

    If anybody sees this as wrong, please present your case :)


    Novel Solutions - http://www.novelsolutionsonline.com/ - Machinery Design • - • Product Development
    "Necessity is the mother of invention." - Author unknown.

  • JasonDorie Posts: 1,925
    edited April 9
    If you reverse the 0's and 1's string, then generate the M/L-codes from that result, you're fine. If you try to reverse the already encoded M/L string, it'll be a big pain in the butt. :)

    That said, once the M/L string is encoded into a series of shorts, they aren't a string any more - they're just an array of 16-bit entities in memory. Reversing a line would just be scanning from the start until you hit the one with the "EOL" command embedded in the upper three bits, and then reverse the order of everything from start of line, up to but not including, the EOL command. You keep thinking of L1234 as a 5-character string. I think of it as a single encoded short - much easier to work with.
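As a rough illustration of the in-memory reversal described above, here is a minimal C++ sketch. It assumes the command sits in the upper three bits of each 16-bit word and that the end-of-line command has the value 2. The constant `kCmdEol` and the function names are invented for this example and are not taken from the thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical command code for "end of line"; the real value depends on
// how the 3-bit command set is assigned.
const std::uint16_t kCmdEol = 2;

// Extract the command from the upper three bits of an encoded word.
inline std::uint16_t commandOf(std::uint16_t word) { return word >> 13; }

// Reverse the commands of the scan line that starts at index `start`,
// leaving the EOL word in place. Returns the index just past the EOL word
// (or words.size() if no EOL word was found).
std::size_t reverseLine(std::vector<std::uint16_t>& words, std::size_t start)
{
    std::size_t eol = start;
    while (eol < words.size() && commandOf(words[eol]) != kCmdEol)
        ++eol;
    std::reverse(words.begin() + start, words.begin() + eol);
    return eol < words.size() ? eol + 1 : eol;
}
```

Calling `reverseLine` repeatedly, feeding each returned index back in as the next `start`, would walk through a whole plot one scan line at a time.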
  • Jason
    If you reverse the 0's and 1's string, then generate the M/L-codes from that result, you're fine. If you try to reverse the already encoded M/L string, it'll be a big pain in the butt.

    Ahhhh... but I now have a plan and I must say that I am pretty darn good with the CString class. CString is a wonderful thing :)

    I am working on it now and I don't believe it should take that long to write the new reversal function. However, it will most likely take significantly longer to generate the plot file, but I will see. I think I should have the function finished in about 15-30 minutes, but I could be wrong, because it has been a while since I have done some serious string reformatting.

    It takes approx. 30 seconds to generate the plot in all versions. I estimate approx. 45 seconds with a new reversal function.



  • idbruce Posts: 5,537
    edited April 9
    Okay here is one test function and another function that does the actual reversal
    CString CReverseStringButNotValuesDlg::ReverseCommands(CString *strScanLine)
    {
    	CString strReversed;	
    
    	while(strScanLine->GetLength() != 0)
    	{
    		int nL;
    		int nM;
    
    		nL = strScanLine->ReverseFind('L');
    		nM = strScanLine->ReverseFind('M');
    
    		if(nL > nM)
    		{
    			strReversed += strScanLine->Right(strScanLine->GetLength() - nL);
    			strScanLine->Delete(nL, strScanLine->GetLength() - nL);
    		}
    		else
    		{
    			strReversed += strScanLine->Right(strScanLine->GetLength() - nM);
    			strScanLine->Delete(nM, strScanLine->GetLength() - nM);
    		}
    	}
    
    	return strReversed;
    }
    
    void CReverseStringButNotValuesDlg::OnTest() 
    {
    	CString strString = "M12345L67890M9876L54321";
    	CString strResult;
    
    	strResult = ReverseCommands(&strString);	
    
    	MessageBox(strResult, "Result", MB_OK);
    	// Should end up being L54321M9876L67890M12345
    }
    

    And an image to verify correct processing

    The 'N' will be added after the conversion
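For readers without MFC, the same idea can be sketched with `std::string`. This is only an illustration, assuming a C++11 compiler (newer than Visual C++ 6.0). The function name `reverseCommands` is made up here, and unlike the CString version it leaves its input untouched.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Split a scan line such as "M12345L67890" into tokens ("M12345", "L67890")
// and join them again in reverse order. The input is not modified.
std::string reverseCommands(const std::string& scanLine)
{
    std::vector<std::string> tokens;
    for (char c : scanLine) {
        if (std::isalpha(static_cast<unsigned char>(c)) || tokens.empty())
            tokens.push_back(std::string());   // a letter starts a new token
        tokens.back() += c;
    }
    std::string reversed;
    for (std::size_t i = tokens.size(); i-- > 0; )
        reversed += tokens[i];
    return reversed;
}
```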





  • Jason
    That said, once the M/L string is encoded into a series of shorts, they aren't a string any more - they're just an array of 16-bit entities in memory. Reversing a line would just be scanning from the start until you hit the one with the "EOL" command embedded in the upper three bits, and then reverse the order of everything from start of line, up to but not including, the EOL command. You keep thinking of L1234 as a 5-character string. I think of it as a single encoded short - much easier to work with.

    I realize this, but my thoughts are that since I will need a PC to scan, mirror, flip, etc... the *.bmp files, I might as well do all the processing I can possibly do with the PC, to remove any unnecessary tasks from the uController.



  • I didn't mean do the reversing of the shorts in the uController - you can still do that in the PC too - just put them into a temporary array, flip them, then write the array to disk. No matter though - your way will work just fine too. In fact, it'd probably be easier just to flip the source 0's & 1's string before you run-length encode it, then you don't need the special string-flip function.
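To make the "flip first, then encode" suggestion concrete, here is a small sketch. It assumes the row is held as a string of '0' (white) and '1' (black) characters and that each number is the full run length in pixels. The function names are invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Run-length encode a row of '0' (white, move) and '1' (black, line) pixels
// into M/L text. Each number is the full run length in pixels.
std::string encodeRow(const std::string& pixels)
{
    std::string out;
    std::size_t i = 0;
    while (i < pixels.size()) {
        std::size_t run = 1;
        while (i + run < pixels.size() && pixels[i + run] == pixels[i])
            ++run;
        out += (pixels[i] == '1') ? 'L' : 'M';
        out += std::to_string(run);
        i += run;
    }
    return out;
}

// Encode the row right-to-left by flipping the pixels first.
std::string encodeRowReversed(std::string pixels)
{
    std::reverse(pixels.begin(), pixels.end());
    return encodeRow(pixels);
}
```

With this approach the reversed line comes out of the ordinary encoder, so no special command-aware string reversal is needed.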
  • idbruce Posts: 5,537
    edited April 9
    Jason

    When I started the new version containing the MLN commands, I eliminated the 1 and 0 formatting. Instead, I have been counting pixels until the color changes, and inserting the pixel count after the M and L commands. So instead of reversing the direction of the bits, I am now reversing the direction of the commands, while keeping each integer as is and grouped with its associated command.

    Either way, that problem is now fixed. I am now looking at the problem of the MNs which have no integer between them, which should be an easy fix. After that, I will start working on the function to write the shorts.

    Shouldn't be much longer until the new version is finished. At least I hope :)



  • idbruce Posts: 5,537
    edited April 9
    I am now looking at the problem of the MNs which have no integer between them, which should be an easy fix.

    After writing some code to fix the MNs which have no integer between them, I ended up with the following result:
    M1793NM1793NM1793NM1793NM1793NM1793NM1793N
    

    Please notice that the MNM combination of a previous post has now become NM, which means that it is now basically correct. As mentioned, the test image has a white border and the portion of output shown above, indicates a white border for seven scan lines.

    This is basically what I want, but there is still a problem. Considering that the starting point, X = 0, the output above, for a single scan line, indicates that a move of 1793 pixels is taking place before the next scan line is encountered. Now consider that even though the starting point X = 0, it still represents a white pixel. So we have X0 (white pixel) + 1793 other white pixels = 1794 white pixels for the entire scan line.

    The remaining problem is that the test bitmap is actually 1795 pixels wide :( "rut row"

    I do not think it will be that hard to find the source of the pixel loss, but now I wonder about the other scan lines, that are not a border. So testing is definitely required.

    EDIT: After a little research, I discovered that I have other problems.... I will keep you updated



  • idbruce Posts: 5,537
    edited April 9
    For Those Who May Be Interested

    It appears as though the previous problems have been solved.

    When I began creating this latest version, I contemplated modeling it after the Windows GDI functions MoveTo and LineTo, or MFC, which also has MoveTo and LineTo functions, but I opted not to, because the LineTo functions require an endpoint coordinate which actually extends one unit past the line and becomes the current position. For that reason, I developed my own strategy of starting with a zero count and counting identically colored pixels from that point on.

    As it applies to my new version, this strategy has several effects on the output. The number which follows an M or L command is always one pixel less than the width of the move or line. This is no big deal, because I knew something like this would happen. The workaround is to always move one pixel forward before processing the command, with the exception of moving to the next scan line. It is also worth mentioning that the pixel count loss is cumulative. Let's assume the width of my test image, which is 1795. Now let's say that the 10th line of the program output (which represents the 10th scan line of a bitmap) has 12 commands, being 7 moves (M) and 5 lines (L). The total pixel count shown by the output would equal 1783 pixels, but if you add the total number of M and L commands within that scan line to the 1783 pixels, you get the actual image width.

    1783 shown output pixels + 12 commands = 1795 test image width

    I think I can live with that.
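The bookkeeping above (width = sum of the numbers + number of M and L commands) can be checked mechanically. The following is a sketch only, with an invented function name, that parses one line of the text output and applies that rule:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Recover the scan-line width from M/L text in which every number is one
// less than the run it describes: width = sum of numbers + command count.
// The trailing 'N' (next line) is not counted.
int scanLineWidth(const std::string& line)
{
    int sum = 0;
    int commands = 0;
    std::size_t i = 0;
    while (i < line.size()) {
        char c = line[i++];
        if (c != 'M' && c != 'L')
            continue;                       // skip 'N' and anything else
        ++commands;
        int value = 0;
        while (i < line.size() && std::isdigit(static_cast<unsigned char>(line[i])))
            value = value * 10 + (line[i++] - '0');
        sum += value;
    }
    return sum + commands;
}
```

Applied to the first line of the sample output in this post (19 commands whose numbers add up to 1776), the function returns 1795, matching the test image width.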

    With the errors eliminated and still in text form, the output plot for the 1795 X 1622 test image has a file size of 104 KB.

    EDIT: The actual storage size for the bitmap (*.bmp file) is 364 KB. So the generated output has an actual file size which is less than 1/3 of the bitmap. I just can't believe it.

    Here is another sample of the output, which is formatted to make it easier to read.
    M48L106M14L80M13L151M18L162M20L697M21L80M21L105M50L9M54L79M48N
    M48L105M13L84M12L149M18L164M20L695M20L84M20L103M47L17M50L79M48N
    M48L104M13L86M12L147M18L166M20L693M20L86M20L100M47L21M48L79M48N
    M48L103M13L88M12L145M18L18M130L18M20L691M20L88M20L98M46L25M46L79M48N
    M48L102M13L90M12L143M18L18M132L18M20L284M42L362M19L90M19L97M45L29M44L79M48N
    M48L101M12L94M11L141M18L18M134L18M20L282M43L361M18L94M18L96M33L54M31L79M48N
    M48L100M12L96M11L139M18L18M136L18M20L280M46L358M18L96M18L93M32L60M28L79M48N
    M48L99M12L98M11L137M18L18M138L18M20L278M48L357M17L98M17L92M31L64M26L79M48N
    M48L98M12L100M11L135M18L18M140L18M20L276M50L355M17L100M17L90M31L66M25L79M48N
    M48L97M12L102M11L133M18L18M142L18M20L274M52L353M17L102M17L88M30L70M23L79M48N
    M48L97M12L102M11L132M18L18M144L18M20L272M54L352M17L102M17L87M30L72M22L79M48N
    M48L96M12L104M11L130M18L18M146L18M20L270M56L350M17L104M17L85M30L74M21L79M48N
    M48L95M12L106M11L128M18L18M148L18M20L268M58L349M16L106M16L84M30L76M20L79M48N
    M48L95M11L108M10L128M17L18M150L18M20L266M59L348M16L108M16L82M31L76M20L79M48N
    M48L94M11L110M10L126M17L18M152L18M20L264M62L345M16L110M16L79M32L35M6L35M19L79M48N
    M48L94M11L110M10L125M17L18M154L18M20L262M64L344M16L110M16L78M32L33M12L33M18L79M48N
    M48L93M11L112M10L123M17L18M156L18M20L260M66L342M16L112M16L76M33L32M14L32M18L79M48N
    M48L93M10L114M9L123M16L18M158L18M20L258M68L341M15L114M15L75M33L32M16L32M17L79M48N
    M48L92M10L116M9L122M15L18M160L18M20L256M70L339M15L116M15L73M34L31M18L31M17L79M48N
    M48L91M11L51M12L51M10L120M15L18M162L18M20L254M20L30M20L337M16L51M12L51M16L71M35L30M20L30M17L79M48N
    M48L91M10L48M20L48M9L120M14L18M164L18M20L252M20L32M20L336M15L48M20L48M15L70M35L30M22L30M16L79M48N
    

    So now I move onto creating the shorts to store the commands and their values.



  • Jason

    I must admit that I am temporarily lost. I have been attempting to study bitwise operations for a large part of the day, but it is all so confusing, with bits and pieces here and there.

    Can you please help?

    You gave me an example with 3 bit command set, which I do not fully understand, although I understand a fair portion.

    However, I am more interested in the 2 bit command set that you first proposed. Such as:
    Make a sequence of shorts (16 bits) where the upper 2 bits are:
    00 = MN
    01 = M
    10 = L
    11 = end-of-stream
    The next 14 bits are your move/line distance.

    However I was thinking

    00 - M
    01 - L
    10 - N
    11 - E

    Should I shift these bits separately into their correct positions, and what data types do I use initially for these bits, as well as for the length of the move? Can I use an int type for the move length and shift that into the shorts?

    The string version would be easier, but when you mentioned 16 bit shorts, with the 2 bit command sets, it really sounded like the way to go.

    I cannot find an example of this anywhere.

    I know that you have given me quite a bit of your time and help already and I really hate to ask, but I am lost.



  • The commands could be any data type large enough to hold the bit pattern. So char, short, or int will all work fine, either signed or unsigned won't matter because you're only using positive values, and later when decoding you'll mask the unwanted bits back off.

    So your commands listed above could be:
      char Cmd_M = 0;  // 00
      char Cmd_L = 1;  // 01
      char Cmd_N = 2;  // 10
      char Cmd_E = 3;  // 11
    
    In C, if you use a value like that in an expression that will end up in a larger data type, the compiler will auto-promote it to the larger type before doing math, so something like this will work:
      short CommandValue = Cmd_L << 14;
    
    That creates a short, and sets the upper two bits to 01. To add in the length of the move or line, you'd do something like this:
      int Distance = MoveDistance;   // This is the previously computed length for the line or move
      Distance = min( Distance, (1<<14) - 1);  // Make sure it doesn't exceed our max possible move distance
      short CommandValue = (Cmd_L << 14) | Distance;
    
    If you know in advance that 14 bits will always be enough, you're basically done. If you might have lines longer than that, you'd need something like this:
      // Keep going while we haven't fully encoded the entire move
      while( MoveDistance > 0 )
      {
        int Distance = MoveDistance;
        Distance = min( Distance, (1<<14) - 1);  // Make sure it doesn't exceed our max possible move distance
        short CommandValue = (Cmd_L << 14) | Distance;
    
        file.Write( &CommandValue, sizeof(CommandValue) );
    
        MoveDistance -= Distance;  // Remove the amount we moved in this command
      }
    
    Note that the file.Write() is assuming that the short ends up in the correct endian order for the Prop. If it doesn't, you'd need to endian swap it before you write it, like this:
      CommandValue = ((CommandValue >> 8 ) & 255) | (CommandValue << 8);
    
    Apologies if this is obvious: 255 is "the lower 8 bits". 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1
    You'll also see it everywhere as 0xff, which is just that written in hexadecimal.
    To decode your commands on the Propeller, you'd use this:
      // Shift the value down by 14 bits to slide the upper two bits down to bits 0 & 1.
      // bitwise and with 3 == (11) in binary, to extract JUST those two bits.
      // The & is necessary if the encodedShort is signed.  Unneeded if it's unsigned
      int Cmd = (encodedShort >> 14) & 3;
    
      int Distance = encodedShort & ((1<<14) - 1);  // Easy way to get "all bits up to" the Nth one
    

    The types for the move and bits can be int, or anything big enough to hold their full range. You may get warnings when moving them into a short, but you can suppress those by using a cast, like this:
          short CommandValue = (short)( (Cmd_L << 14) | Distance );
    
    ..or like this, if you prefer:
          short CommandValue = short( (Cmd_L << 14) | Distance );
    

    I used char, short, int here mostly for compactness. You would probably find life slightly easier using unsigned char and unsigned short, because you wouldn't have to worry about the sign bit getting extended.

    The Cmd_N command, for example, when moved up into the upper two bits, would look like 10xxxxxxxxxxxxxx (in binary). That upper bit would be considered the sign bit in a signed short, so when you shift it back down, you'd get 1111111111111110 because it's a "negative" number. The lower two bits are still right, you just have that sign bit repeated all the way down, which is why you do the &3 to get rid of it. If the encodedShort value is unsigned, shifting it down doesn't repeat the sign bit, so you don't need the &3.

    Does that all make sense?
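Pulling the pieces above together, here is a minimal round-trip sketch using unsigned 16-bit words, so no sign extension can occur. It assumes the M/L/N/E numbering proposed earlier in the thread. The enum, constant and function names are invented for this example, and it relies on C++11, so it would need adapting for older compilers.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

enum Command : std::uint16_t { CmdM = 0, CmdL = 1, CmdN = 2, CmdE = 3 };

const int kMaxDistance = (1 << 14) - 1;   // 16383, the largest 14-bit value

// Pack a 2-bit command and a 14-bit distance into one 16-bit word.
std::uint16_t encode(Command cmd, int distance)
{
    return static_cast<std::uint16_t>((cmd << 14) | (distance & kMaxDistance));
}

// Split a distance that may exceed 14 bits into several words.
std::vector<std::uint16_t> encodeRun(Command cmd, int distance)
{
    std::vector<std::uint16_t> words;
    while (distance > 0) {
        int chunk = std::min(distance, kMaxDistance);
        words.push_back(encode(cmd, chunk));
        distance -= chunk;
    }
    return words;
}

// Decode: the word is unsigned, so a plain shift isolates the command.
Command commandOf(std::uint16_t word) { return static_cast<Command>(word >> 14); }
int distanceOf(std::uint16_t word)    { return word & kMaxDistance; }
```

For instance, `encodeRun(CmdM, 20000)` yields two words, one for 16383 steps and one for the remaining 3617.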
  • JasonDorie wrote: »
    The commands could be any data type large enough to hold the bit pattern. So char, short, or int will all work fine, either signed or unsigned won't matter because you're only using positive values, and later when decoding you'll mask the unwanted bits back off.

    To simplify things a bit, you can use bit definitions in a struct.

    This defines a 16-bit data structure with two elements that can be addressed like any other element in a struct. The compiler will generate the appropriate bit masks and shifts:
    struct _data {
        uint16_t arg :14;
        uint16_t cmd :2;
    } __attribute__ ((packed));
    
    struct _data commandValue;
    
    commandValue.cmd = 1;
    commandValue.arg = 255;
    
    int cmd = commandValue.cmd;
    int distance = commandValue.arg;
    

    The structure can be used as any other data type as a single or as an array (struct _data commandValue[128]) or as part of other structures, etc.

    Moreover, you can use union to access the whole struct as a base data type:
    struct _data {
        union {
            struct {
                uint16_t arg :14;
                uint16_t cmd :2;
            };
            uint16_t w;
        };
    } __attribute__ ((packed));
    
    struct _data commandValue;
    
    commandValue.w = 0b0100000011111111;
    int cmd = commandValue.cmd; // == 1
    int distance = commandValue.arg; // == 255
    

    The number after the : is the width of the field in bits. Whitespace around the colon is optional.

    Hope this helps.
  • JasonDorie Posts: 1,925
    edited April 10
    I left that out because different compilers may choose to pack the bits from the bottom or the top. I can't recall if there's a standard or not, but Macca is right - this works, and as long as you inspect the results to be sure what you're getting with your particular compiler, it can remove a lot of the funky shift code.
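One way to "inspect the results" is to copy the bit-field struct into a plain 16-bit integer and look at it. The sketch below is an assumption-laden illustration: the names `PackedCmd`, `rawBits` and `layoutMatchesShiftCode` are invented here, and the answer it gives is specific to whichever compiler and target it is built with.

```cpp
#include <cstdint>
#include <cstring>

struct PackedCmd {
    std::uint16_t arg : 14;
    std::uint16_t cmd : 2;
};

// Copy the struct's bytes into a plain 16-bit integer so the layout chosen
// by the compiler can be inspected.
std::uint16_t rawBits(const PackedCmd& p)
{
    std::uint16_t raw = 0;
    std::memcpy(&raw, &p, sizeof raw);
    return raw;
}

// True when `cmd` ended up in the upper two bits and `arg` in the lower 14,
// which is the layout the shift-and-mask code expects.
bool layoutMatchesShiftCode()
{
    PackedCmd p = {};
    p.cmd = 1;
    p.arg = 255;
    return sizeof(PackedCmd) == 2 && rawBits(p) == 0x40FF;
}
```

If `layoutMatchesShiftCode()` returns false on either the PC or the target, files written through the bit-field struct will not match files written with explicit shifts.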

    I sometimes use functions when I have to be sure it will survive across platforms. In C++, a struct can have member functions, like this:
    struct CMDSHORT
    {
      unsigned short value;
    
      void SetCmd( int cmd ) {
        value = (value & 0x3FFF) | (cmd << 14);
      }
    
      void SetDist( int dist ) {
        value = (value & 0xC000) | (dist & 0x3FFF);  // keep dist out of the command bits
      }
    };
    
    CMDSHORT c;
    
      c.SetCmd( Cmd_L );
      c.SetDist( moveDistance );
    
      file.Write( &c.value, sizeof(c.value) );
    
  • @Jason

    That is outstanding. Thank you once again for coming to the rescue. Your response was very clear and concise, and I couldn't have asked for better instructions. I should have no problem finishing up this new version with the information that you provided.
    If you know in advance that 14 bits will always be enough, you're basically done.

    From the internet, I have found that the largest value for 14 bits is 16,383. My goal is to shoot for 600 dpi, and a max value of 16383 should be able to provide me with an image area of 27.3 in. X 27.3 in., which is more than sufficient for my needs and probably most other people's needs.
    I used char, short, int here mostly for compactness. You would probably find life slightly easier using unsigned char and unsigned short, because you wouldn't have to worry about the sign bit getting extended.

    I believe that unsigned char and unsigned short are both valid data types in Visual C++ 6.0, if so, that is what I will use.

    I cannot thank you enough for all of your assistance, guidance, patience, and helpful snippets. To be perfectly honest, I am more fond of this version than the two preceding ones. It is the neatest and best documented version of the three, and as you have hinted, it will most likely be the most efficient.

    I suppose the least I can do, besides thanking you, is to let you know what I am up to. I was in the middle of building a rotary plotter prototype, and everything was going just fine, when a new idea popped into my head, for another prototype. Of course I got sidetracked, and I have not worked on the rotary plotter since then, but there is no rush on it and I am well on my way with that project. However, hopefully I will have them both done by the beginning of summer (yeah right, wishful thinking). Anyhow, I will be building a new and simple prototype, for laser direct imaging of PCBs. Instead of utilizing Gerber files for the PCB, I will be exporting bitmap images out of Eagle CAD, at 600 dpi. Instead of worrying about interpolation of X and Y, I intend to attempt raster scans, which I can easily code myself for the Propeller. The output from this program, should enable me to easily experiment with LDI and the Propeller. The prototype itself will not be anything too complicated, just a small and simple machine, which I can picture mostly in my head, and have most of it drawn already. This program will be finished today, and then I can finish my design and begin construction.

    The details are that each pixel represents a stepper motor step, with an M command equaling Pen Up and move horizontally, an L command equaling Pen Down and move horizontally, an N command equaling Pen Up and move vertically 1 step (pixel), and E indicates that the job is done.
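As a sketch of how a controller might walk such a command stream, the following simulation interprets the four commands and tallies the motion. It is written in portable C++ rather than Propeller code, assumes the 2-bit numbering 0=M, 1=L, 2=N, 3=E, and treats each distance literally as a step count. `PlotterState` and `run` are invented names, and resetting `x` at each N is a simplification: how the carriage actually returns or alternates direction is not modelled.

```cpp
#include <cstdint>
#include <vector>

struct PlotterState {
    int x = 0;          // steps travelled along the current scan line
    int y = 0;          // current scan line
    long exposed = 0;   // total steps made with the pen (laser) down
    bool done = false;  // set when the E command is reached
};

// Run a stream of 16-bit words: upper 2 bits = command (0=M, 1=L, 2=N, 3=E),
// lower 14 bits = distance in steps.
PlotterState run(const std::vector<std::uint16_t>& words)
{
    PlotterState s;
    for (std::uint16_t w : words) {
        int cmd = w >> 14;
        int dist = w & 0x3FFF;
        switch (cmd) {
        case 0: s.x += dist; break;                     // M: pen up, move
        case 1: s.x += dist; s.exposed += dist; break;  // L: pen down, move
        case 2: s.y += 1; s.x = 0; break;               // N: next scan line
        case 3: s.done = true; return s;                // E: job finished
        }
    }
    return s;
}
```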

    The machine will be simple. However, I am certain that focusing a tiny dot to make a proper exposure on the film will be very complicated and will require quite a bit of experimentation.

    Anyhow Jason, that is what I am up to. Thank you for all of your assistance.

    Bruce

    @macca

    The use of a struct makes very good sense, but I think I am going to stick with the code that Jason provided. However, I will keep what you said in mind, because that is very handy information.

    Thank you for your input.



  • Jason
    I left that out because different compilers may choose to pack the bits from the bottom or the top. I can't recall if there's a standard or not, but Macca is right - this works, and as long as you inspect the results to be sure what you're getting with your particular compiler, it can remove a lot of the funky shift code.

    I sometimes use functions when I have to be sure it will survive across platforms. In C++, a struct can have member functions, like this:

    I am glad that Macca brought the subject up, because after seeing your response and code, I now like the idea of using a struct. Now let's see if it is a cut and paste into Visual C++ 6.0 :)



  • Now let's see if it is a cut and paste into Visual C++ 6.0

    Well, it was not a cut and paste job, but this compiles...
    typedef struct
    {
    	unsigned short value;
    
    	void SetCmd(int cmd)
    	{
    		value = (value & 0x3FFF) | (cmd << 14);
    	}
    
    	void SetDist(int dist)
    	{
    		value = (value & 0xC000) | dist;
    	}
    
    } CMDSHORT, *LPCMDSHORT;
    



  • idbruce Posts: 5,537
    edited April 10
    Okay, so I had a freakin' error, which sent me into an endless loop. Darn thing took me a couple of hours to figure out where it was :(

    Anyhow, I think I am getting a much better picture of what is really going on inside of a file, but please correct me if I am wrong.

    A file is nothing more than a container for various bits of data, which can contain any data type. However, to successfully read the data, you must know the specific data type being used at each location, so that you can acquire the data at the intended offset. In other words, a long, a short, and a byte can all be written to the same file, as long as you know their locations. So if a file has all three in the mentioned order, you would first read 32 bits, then 16 bits, and then 8 bits, to get their individual values.

    Is this assumption correct?

    I am asking for the sake of general knowledge and because I now believe that I need a different solution for a header. First off, there are 4 different possible configurations for the file data, which are as follows:

    00 - Bottom Image Or Top Image Left Hand Aligned Without Alternating Lines
    01 - Bottom Image Or Top Image Left Hand Aligned With Alternating Lines
    10 - Bottom Image Right Hand Aligned Without Alternating Lines
    11 - Bottom Image Right Hand Aligned With Alternating Lines

    To enable proper decoding of the data, the data arrangement should be known at run time, therefore it only makes good sense to save this information to the header.

    Since 14 bits are being used to describe line and move lengths, this software should have the ability to handle images up to 27.3 in. X 27.3 in., therefore it also makes good sense to be able to store the width and height dimensions in 14-bit fields.

    So my thoughts are to use a long in a similar fashion, as was done with the shorts to store the commands and movements. The data arrangement would go into the upper 2 bits, followed by the width capable of 14 bit storage, followed by the height capable of 14 bit storage also, thus leaving 2 bits left over.
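A header along those lines could be packed as in the sketch below. The exact bit positions (arrangement in bits 31-30, width in bits 29-16, height in bits 15-2, two spare bits at the bottom) are an assumption made for this example, as are the `Header`, `packHeader` and `unpackHeader` names.

```cpp
#include <cstdint>

struct Header {
    std::uint32_t arrangement;  // 0..3, the four layouts listed above
    std::uint32_t width;        // image width in pixels, 0..16383
    std::uint32_t height;       // image height in pixels, 0..16383
};

// Layout (an assumption for this sketch): bits 31-30 arrangement,
// bits 29-16 width, bits 15-2 height, bits 1-0 unused.
std::uint32_t packHeader(const Header& h)
{
    return ((h.arrangement & 0x3u)    << 30) |
           ((h.width       & 0x3FFFu) << 16) |
           ((h.height      & 0x3FFFu) << 2);
}

Header unpackHeader(std::uint32_t raw)
{
    Header h;
    h.arrangement = (raw >> 30) & 0x3u;
    h.width       = (raw >> 16) & 0x3FFFu;
    h.height      = (raw >> 2)  & 0x3FFFu;
    return h;
}
```

Using a fixed-width type such as `std::uint32_t` rather than `long` avoids surprises on platforms where `long` is 64 bits wide.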

    Can long and shorts be stored in the same file?

    And just for general information..... Without the header included in the output file.... Utilizing the same test image for all three versions..... When stored on a Windows OS.....

    First Version (Text Version): 2847 KB File Size
    Second Version (Binary Version): 357 KB File Size
    Third Version (Binary Version With Unique Encoding): 67 KB File Size



  • Yes, your assumptions are correct. Files are just sequential bytes of data. How the application interprets them is the important part. If you stored a long and a byte into a file you just get 5 bytes in the file. The order of the bytes from the long variable will depend on the endianness of the processor - it's just written from memory in order.

    As for the header, what you describe will work fine.
  • Jason

    Perfect!

    Thanks to this application and your help, I now have a much better idea and understanding of files. I never really put much thought into it before.



  • kwinn Posts: 7,645
    edited April 10
    You're basically correct, but files can get more complicated than that. Wikipedia has a reasonably good explanation.

    Typically files are stored on disks or SD cards as blocks of bytes (typically 512 bytes per block). Bits can be stored 8 to a byte, characters or numbers from 0-255 as one byte, 16-bit values as 2 bytes, and 32-bit values as 4 bytes. It is up to the software to determine how the bytes in the blocks are treated.
    In science there is no authority. There is only experiment.
    Life is unpredictable. Eat dessert first.
  • idbruce Posts: 5,537
    edited April 10
    kwinn

    Okay... So let's say that I have a file which first has a long and then a byte. In order to get the value of the long, would I just read four bytes and then assemble them somehow?



  • msrobots Posts: 1,775
    edited April 10
    no,
    when you first write a long and then a byte you need also to first read a long and then read a byte.

    Enjoy!

    Mike
    I am just another Code Monkey.
    A determined coder can write COBOL programs in any language. -- Author unknown.
    Press any key to continue, any other key to quit

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this post are to be interpreted as described in RFC 2119.
  • If you're reading / writing on platforms with the same endianness, it's enough to just do this:
      file.writeBytes( &myLong, sizeof(myLong) );
      file.readBytes( &myLong, sizeof(myLong) );
    
    That just writes / reads bytes from the file from/to memory, in the order they're stored in RAM. If your platforms have different endianness, then you'd need to swap the bytes around on one of the platforms, either before writing to the file, or after reading from the file.

    You can also force a specific order by doing something like this instead:
    void Write32( long value )
    {
      file.writeByte( (value >> 0 ) & 0xff );
      file.writeByte( (value >> 8 ) & 0xff );
      file.writeByte( (value >> 16) & 0xff );
      file.writeByte( (value >> 24) & 0xff );
    }
    
    long Read32(void)
    {
      long value;
      value =  file.readByte() << 0;  // obviously you don't need to shift by zero, I just do this to show consistency
      value |= file.readByte() << 8;
      value |= file.readByte() << 16;
      value |= file.readByte() << 24;
      return value;
    }
    
    That code always reads & writes the bytes in a specific order, regardless of the endianness of the hardware. I wrote it out to show the ops, but the shifting can easily be done with a for loop.
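For reference, the loop form might look like the sketch below. To keep it self-contained it writes into a `std::vector` of bytes standing in for the file, which is an assumption of this example, and `write32` / `read32` are illustrative names rather than part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 32-bit value to the buffer, least significant byte first.
void write32(std::vector<std::uint8_t>& out, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
}

// Read a 32-bit value starting at `pos`, least significant byte first.
// The caller must make sure at least four bytes are available.
std::uint32_t read32(const std::vector<std::uint8_t>& in, std::size_t pos)
{
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
    return value;
}
```

Writing a 32-bit value followed by a single byte this way produces exactly five bytes, and reading them back in the same order recovers both values.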
  • Mike
    when you first write a long and then a byte you need also to first read a long and then read a byte.

    Well, that is what I initially thought, but then kwinn said
    Typically files are stored on disks or sd cards as blocks of bytes

    and so I got confused :)

    Jason

    Thanks for the explanation



  • Even your hard-disk based file system stores data in blocks (pages, clusters, etc.), but that part of it is abstracted away from you. The system stores the file length in a directory header. If you write 5 bytes to a file on your hard drive, then close the file, the file system allocates a full "page", likely 4 KB or somewhere around there, and writes your data into it. Under the hood, when you read the file back, the system pulls in data a page at a time, and remembers that you only wrote 5 bytes to the file. When you try to read past that 5th byte, even though there's data there, you didn't write it, so the system tells you you're at the end of the file.

    Even lower level than that, the file system stores links between these pages, because they may not be arranged sequentially on the drive. Defragmenting is the act of collecting up these scattered pages and writing them in sequential order so the head of the drive doesn't have to bounce all over the place when reading the data.

    So while Kwinn is correct, for your purposes files can be considered a simple stream of bytes because that's how it's presented to you at the API level. :)
  • idbruce wrote: »
    kwinn

    Okay... So let's say that I have a file which first has a long and then a byte. In order to get the value of the long, would I just read four bytes and then assemble them somehow?

    Yes. Typically the bytes from a long are written to a file sequentially so when the block is read back from the file the bytes should be in the same sequence as they were written in. If the file block was at a specific long aligned starting address in hub ram when it was written and later read back to a long aligned hub ram address all the data in the block would be in the same relative position.
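As a rough illustration of reading things back in the order they were written, here is a sketch of a long followed by a byte. The `Record` struct and both function names are invented for the example, a `std::vector` stands in for the file, and the writer and reader are assumed to share the same endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record: one 32-bit "long" followed by one byte.
struct Record
{
    std::uint32_t longValue;
    std::uint8_t  byteValue;
};

// Append the long's four bytes in memory order, then the single byte.
void WriteRecord(std::vector<std::uint8_t>& out, const Record& r)
{
    std::uint8_t raw[sizeof r.longValue];
    std::memcpy(raw, &r.longValue, sizeof raw);
    out.insert(out.end(), raw, raw + sizeof raw);
    out.push_back(r.byteValue);
}

// Read back in the same order: four bytes into the long, then the byte.
Record ReadRecord(const std::vector<std::uint8_t>& in, std::size_t pos)
{
    assert(pos + 5 <= in.size());   // 4 bytes for the long plus 1 for the byte
    Record r{};
    std::memcpy(&r.longValue, in.data() + pos, sizeof r.longValue);
    r.byteValue = in[pos + sizeof r.longValue];
    return r;
}
```

The record takes exactly five bytes in the stream, even though the struct itself may be padded in memory.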
    In science there is no authority. There is only experiment.
    Life is unpredictable. Eat dessert first.
  • Jason
    Even your hard-disk based file system stores data in blocks (pages, clusters, etc), but that part of it is abstracted away from you. The system stores the file length in a directory header. If you write 5 bytes to a file on your hard drive, then close the file, the hard drive allocates a full "page", likely 4Kb or somewhere around there, and writes your data into it. Under the hood, when you read the file back, the system pulls in data a page at a time, and remembers that you only wrote 5 bytes to the file. When you try to read past that 5th byte, even though there's data there, you didn't write it, so the system tells you you're at the end of the file.

    Even lower level than that, the file system stores links between these pages, because they may not be arranged sequentially on the drive. Defragmenting is the act of collecting up these scattered pages and writing them in sequential order so the head of the drive doesn't have to bounce all over the place when reading the data.

    That's quite informative.... I may have asked you this before, and if so, I forgot, but did you get a degree in CS?
    So while Kwinn is correct, for your purposes files can be considered a simple stream of bytes because that's how it's presented to you at the API level.

    Thanks for defragmenting my confusion :) Now it is time to refrag trying to figure out the header :)

    Okay, I waited to respond to see if I could figure this out, but again I am lost. This is what I have so far, but I know it is wrong, because I do not know these things. I should have stuck with strings :) Just kidding :) This version is much better. Anyways, here is what I am thinking, even though it is wrong.
    typedef struct
    {
    	unsigned long value;
    
    	void SetConfig(int config)
    	{
    		value = (value & 0x3FFF) | (config << 30);
    	}
    
    	void SetWidth(int width)
    	{
    		value = (value & 0xC000) | width;
    	}
    
    	void SetHeight(int height)
    	{
    		value = (value & 0xC000) | height;
    	}
    
    } HDRLONG, *LPHDRLONG;
    



  • idbruce Posts: 5,537
    edited April 10
    And then, here is my WriteBitmapHeader function...
    void CMLNDlg::WriteBitmapHeader(unsigned char Config, unsigned short Width, unsigned short Height)
    {
    	HDRLONG hdr;
    
    	hdr.SetConfig(Config);
    	hdr.SetWidth(Width);
    	hdr.SetHeight(Height);
    
    	outputFile->Write(&hdr.value, sizeof(hdr.value));
    }
    

    As well as the call to the WriteBitmapHeader function...
    				// Write bitmap header for MLN file
    				//////////////////////////////////////////////////////////////////////////////////////////////
    
    				// Prototype
    				// WriteBitmapHeader(unsigned char Config, unsigned short Width, unsigned short Height)
    
    				// Possible configurations values
    				
    				// Config value = 0, 00 - Bottom Image Or Top Image Left Hand Aligned Without Alternating Lines
    				// Config value = 1, 01 - Bottom Image Or Top Image Left Hand Aligned With Alternating Lines
    				// Config value = 2, 10 - Bottom Image Right Hand Aligned Without Alternating Lines
    				// Config value = 3, 11 - Bottom Image Right Hand Aligned With Alternating Lines
    
    				unsigned char config;
    
    				//Bottom Image Or Top Image Left Hand Aligned Without Alternating Lines
    				if((bLeftHandAlternate == FALSE && bMirror == TRUE && bMirrorRightHandAligned == FALSE) ||
    					(bLeftHandAlternate == FALSE && bMirror == FALSE))
    				{
    					config = 0;
    
    				}
    
    				//Bottom Image Or Top Image Left Hand Aligned With Alternating Lines
    				if((bLeftHandAlternate == TRUE && bMirror == TRUE) ||
    					(bLeftHandAlternate == TRUE && bMirror == FALSE))
    				{
    					config = 1;
    				}
    
    				//Bottom Image Right Hand Aligned Without Alternating Lines
    				if(bMirror == TRUE && bMirrorRightHandAligned == TRUE &&
    					bRightHandAlternate == FALSE)
    				{
    					config = 2;
    				}
    
    				//Bottom Image Right Hand Aligned With Alternating Lines
    				if(bMirror == TRUE && bMirrorRightHandAligned == TRUE &&
    					bRightHandAlternate == TRUE)
    				{
    					config = 3;
    				}
    
    				WriteBitmapHeader(config, (unsigned short)bmWidth, (unsigned short)bmHeight);
    				//////////////////////////////////////////////////////////////////////////////////////////////
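The four if-blocks above could possibly be folded into one expression. The sketch below assumes bit 1 of the config means "right hand aligned" and bit 0 means "alternating lines", which matches the table of config values in the comments; `MakeConfig` is a hypothetical helper name.

```cpp
// Hypothetical helper: derive the 2-bit config value from the dialog's four flags.
// Bit 1 = right hand aligned (mirrored images only), bit 0 = alternating lines.
// Each alignment has its own "alternate" flag, so the right-hand flag is used
// only when the image is mirrored AND right hand aligned.
constexpr unsigned char MakeConfig(bool leftHandAlternate, bool mirror,
                                   bool mirrorRightHandAligned, bool rightHandAlternate)
{
    return static_cast<unsigned char>(
        (mirror && mirrorRightHandAligned)
            ? (rightHandAlternate ? 3 : 2)    // right hand aligned: 11 or 10
            : (leftHandAlternate  ? 1 : 0));  // left hand aligned:  01 or 00
}
```

A call such as `MakeConfig(bLeftHandAlternate == TRUE, bMirror == TRUE, bMirrorRightHandAligned == TRUE, bRightHandAlternate == TRUE)` should give the same result as the if-blocks for every combination of flags, since the later blocks there override the earlier ones.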
    

    EDIT: Oops... I forgot to add a CFile pointer to the function. Sorry.



  • You're close: the shifts are correct, but the masks are wrong.

    There are two ways to do this. One would be to remove the masks altogether, and just set the value to 0 when you first make the struct. After that you're just OR-ing in bits. The alternate way is to get the masks and shifts right so you can replace any part of it at any time.
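A minimal sketch of the first way (start from zero and only OR bits in) might look like this. It assumes a layout of 2 config bits at bit 28, 14 width bits at bit 14, and 14 height bits at bit 0; `PackHeader` is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Build the header from zero by OR-ing each field into place.
// Assumed layout: config at bits 28-29, width at bits 14-27, height at bits 0-13.
std::uint32_t PackHeader(unsigned config, unsigned width, unsigned height)
{
    assert(config < 4u);            // must fit in 2 bits
    assert(width  < (1u << 14));    // must fit in 14 bits
    assert(height < (1u << 14));    // must fit in 14 bits

    std::uint32_t value = 0;        // start from zero, so no masking is needed
    value |= static_cast<std::uint32_t>(config) << 28;
    value |= static_cast<std::uint32_t>(width)  << 14;
    value |= static_cast<std::uint32_t>(height);
    return value;
}
```

The catch with this approach is that a field can only be set once: OR-ing a second width into the same value would merge the two instead of replacing the first.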

    If your bit layout is this:
     xx_CC_WWWWWWWWWWWWWW_HHHHHHHHHHHHHH  -  2 unused + 2 bits command, + 14 bits width + 14 bits height
    
    ...and you want masks so you can zero out any of those, the "easy" way is this:
      int maskOff = ~( ((1 << NumBits)-1) << BitPosition );
    
    That line of code will generate a 32 bit mask allowing you to AND bits out of an existing value.

    Taking it apart:
    1<< NumBits
    If you have an N bit quantity, you can represent any number from 0 to (1<<N)-1 (or 2 to the power of N, less one). Assume NumBits is 14:
      1 =         00000000_00000000_00000000_00000001 (binary - single 1 bit)
      1 << 14 =   00000000_00000000_01000000_00000000 (binary - that bit shifted to position 14)
      (1<<14)-1 = 00000000_00000000_00111111_11111111 (binary - subtracting 1 turns on all bits UP TO, but not including that bit)
    

    In C, the ~ operator (bitwise NOT) flips every bit, creating the exact opposite bit pattern. So the next two parts are:
      ((1<<14)-1) << 14   = 00001111_11111111_11000000_00000000 (binary - just the last number above shifted up into the "width" position, starting at bit 14)
     ~(((1<<14)-1) << 14) = 11110000_00000000_00111111_11111111 (binary - the opposite of the previous value)
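The same calculation can be checked by the compiler. This sketch wraps it in a small function (`ClearMask` is a hypothetical name) and confirms the hex equivalents of the binary patterns above, plus the masks for 14 bits at position 0 and 2 bits at position 28.

```cpp
#include <cstdint>

// Mask with zeros over a field of 'numBits' bits starting at 'bitPos', ones elsewhere.
// Valid for numBits in 1..31.
constexpr std::uint32_t ClearMask(unsigned bitPos, unsigned numBits)
{
    return ~(((std::uint32_t{1} << numBits) - 1u) << bitPos);
}

// 11110000_00000000_00111111_11111111 in binary is 0xF0003FFF in hex.
static_assert(ClearMask(14, 14) == 0xF0003FFFu, "14 bits at position 14 cleared");
static_assert(ClearMask(0, 14)  == 0xFFFFC000u, "14 bits at position 0 cleared");
static_assert(ClearMask(28, 2)  == 0xCFFFFFFFu, "2 bits at position 28 cleared");
```

If any of those numbers were wrong, the file would simply fail to compile.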
    

    So now, if you bitwise AND that generated mask with your value, you zero out all the bits with zeros in the mask, and leave alone the other bits. I would normally calculate this once somewhere in the file and just re-use it, either as a const or a define, like this:
    #define Head_Config_BitPos  28
    #define Head_Config_Bits     2
    #define Head_Width_BitPos   14
    #define Head_Width_Bits     14
    #define Head_Height_BitPos   0
    #define Head_Height_Bits    14
    
    #define Make_Mask( Pos, Bits)  (  ~( ((1UL<<(Bits))-1) << (Pos)) )   // arguments parenthesized so expressions can be passed safely
    
    #define Head_Config_Mask (Make_Mask( Head_Config_BitPos, Head_Config_Bits) )
    #define Head_Width_Mask  (Make_Mask( Head_Width_BitPos,  Head_Width_Bits) )
    #define Head_Height_Mask (Make_Mask( Head_Height_BitPos, Head_Height_Bits) )
    
    typedef struct
    {
    	unsigned long value;
    
    	void SetConfig(int config)
    	{
    		value = (value & Head_Config_Mask) | (config << Head_Config_BitPos);
    	}
    
    	void SetWidth(int width)
    	{
    		value = (value & Head_Width_Mask) | (width << Head_Width_BitPos);
    	}
    
    	void SetHeight(int height)
    	{
    		value = (value & Head_Height_Mask) | (height << Head_Height_BitPos);
    	}
    
    } HDRLONG, *LPHDRLONG;
    

    That's a really long-winded explanation, but hopefully it helps. Once you know what those magic numbers are, you can just encode them directly into the source, which is what I did before with the 16-bit versions (0x3FFF and 0xC000), but this makes it easier to change them if you want to. I haven't actually compiled this, FYI, so there may be errors, but it should get the ideas across.
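For the reading side, getting a field back out is the reverse operation: shift it down to bit 0, then AND away everything above it. The sketch below assumes the same positions and widths as the defines (config: 2 bits at bit 28, width: 14 bits at bit 14, height: 14 bits at bit 0); the `k...` constants and `Get...` names are invented for the example.

```cpp
#include <cstdint>

// Field positions and widths, assumed to match the header layout being written.
constexpr unsigned kConfigPos = 28, kConfigBits = 2;
constexpr unsigned kWidthPos  = 14, kWidthBits  = 14;
constexpr unsigned kHeightPos = 0,  kHeightBits = 14;

// Shift the field down to bit 0, then keep only its low 'bits' bits.
constexpr std::uint32_t GetField(std::uint32_t value, unsigned pos, unsigned bits)
{
    return (value >> pos) & ((std::uint32_t{1} << bits) - 1u);
}

constexpr std::uint32_t GetConfig(std::uint32_t v) { return GetField(v, kConfigPos, kConfigBits); }
constexpr std::uint32_t GetWidth (std::uint32_t v) { return GetField(v, kWidthPos,  kWidthBits);  }
constexpr std::uint32_t GetHeight(std::uint32_t v) { return GetField(v, kHeightPos, kHeightBits); }
```

Note that the mask used for reading, `(1 << bits) - 1`, is the un-inverted, un-shifted form of the one used for clearing a field.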

    And no, I don't have a degree of any kind. Self-taught to a point, and then learned on the job from a bunch of smart people. I read a lot of other people's code when I'm learning something - find examples and learn from them, troll CodeProject, etc, etc. It's partly why I like helping other people - trying to pay back some of that Karmic debt. :)
  • Jason

    I don't know how you can see and read all that, it is all very confusing to me, and it always has been. I just wish I had a good book on this subject alone - "Bit Manipulation For Dummies" :)
    I read a lot of other people's code when I'm learning something - find examples and learn from them, troll CodeProject, etc, etc.

    Yea, that is about the same for what I know.

    That struct and the defines sure look like something serious is going on there :)

    Anyhow, I truly don't know which way to go. It is not my field of expertise. However, I do know that it will be a set and forget situation, so basically just a write on the PC side and a read on the microcontroller side, or at least I think so. I can't imagine wanting to change the header or the code after it has been written.
    Once you know what those magic numbers are, you can just encode them directly into the source, which is what I did before with the 16-bit versions (0x3FFF and 0xC000)

    Well, even if I knew the magic numbers, I would still be unsure of their use. I suppose a good place to start would be the study of the binary numbering system and then the hexadecimal numbering system, both of which are heavily used.

    I suppose if it is possible, I would prefer to have both structs similar. For me, I like the way you just encoded the "magic numbers" directly into the source, but I also understand your point about the possibility of wanting to change things. Just out of curiosity, why would I possibly want to change anything in the bitmap header, except for possibly adding another function or parameter? I suppose your answer will be if I ever need the upper two bits.


