Binary search
These functions search the binary contents of the input file or otherwise inspect them. All of them expect the current file position to be aligned on a byte boundary and throw an error if it is not. Call Align(1) to perform such alignment.
FoundAt = FindMasked8(Find[, Mask=0xff])
FoundAt = FindMasked16(Find[, Mask=0xffff[, SwapBytes=false]])
FoundAt = FindMasked32(Find[, Mask=0xffffffff[, SwapBytes=false]])
Len, FoundZero = AsciizStringLength()
Len, FoundZero = UnicodeStringLength()
FoundAt = FindMasked8(Find[, Mask])
This function searches the input file for the requested byte (the Find parameter). If the byte is found, the function returns the position at which it is located. Otherwise, it returns nil. If you pass a number greater than 255 as the Find parameter, only the LSB of this number is used for the search.
The optional Mask parameter defines a mask that is applied (ANDed) to each byte of the input file before that byte is compared with the Find value. If this parameter is omitted, no masking is performed (that is, Mask is equal to 0xff). Again, if you pass a number greater than 255 as the Mask parameter, only its LSB is used. Here is an example of how to find a byte that is either 0x00 or 0x01:
-- find either 0x00 or 0x01
local FoundAt = FindMasked8(0x00, 0xfe)
if FoundAt then
  -- we've found it! do something here
else
  error("Couldn't find 0x00 or 0x01")
end
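To make the masking rule concrete, here is a minimal pure-Lua model of the comparison described above. It is a sketch, not the tool's implementation: it searches a Lua string instead of the input file, the names band and model_find_masked8 are hypothetical, and returning a 0-based byte offset is an assumption about what "position" means.

```lua
-- Portable bitwise AND for non-negative integers (does not rely on the
-- Lua 5.3 `&` operator, so it also runs on older interpreters).
local function band(a, b)
  local result, bit = 0, 1
  while a > 0 and b > 0 do
    if a % 2 == 1 and b % 2 == 1 then
      result = result + bit
    end
    a = math.floor(a / 2)
    b = math.floor(b / 2)
    bit = bit * 2
  end
  return result
end

-- Hypothetical model: search `data` (a Lua string) from the 0-based offset
-- `start` for a byte b such that (b AND mask) == find.
local function model_find_masked8(data, start, find, mask)
  find = find % 256            -- only the LSB of Find is used
  mask = (mask or 0xff) % 256  -- omitted Mask means 0xff; only its LSB is used
  for offset = start, #data - 1 do
    local b = string.byte(data, offset + 1)
    if band(b, mask) == find then
      return offset
    end
  end
  return nil
end

local data = string.char(0x10, 0x7f, 0x01, 0x00)
print(model_find_masked8(data, 0, 0x00, 0xfe))  -- 2: 0x01 matches under mask 0xfe
print(model_find_masked8(data, 0, 0x00))        -- 3: exact match on 0x00
```

The mask 0xfe clears bit 0 of every file byte, so both 0x00 and 0x01 reduce to 0x00 and compare equal to Find.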
FoundAt = FindMasked16(Find[, Mask[, SwapBytes]])
This function searches the input file for the requested 16-bit word (the Find parameter). If the word is found, the function returns the position at which it is located. Otherwise, it returns nil. If you pass a number greater than 0xffff as the Find parameter, only the LSW of this number is used for the search. The optional Mask parameter defines a mask that is applied (ANDed) to each word of the input file before that word is compared with the Find value. If this parameter is omitted, no masking is performed (that is, Mask is equal to 0xffff). Again, if you pass a number greater than 0xffff as the Mask parameter, only its LSW is used.
The function performs a detailed search: it advances one byte at a time rather than one word at a time. After checking the word at address Addr, it checks the word at address Addr+1, so a match can start at any byte offset.
By default, the function performs the search using the Intel byte order (the LSB comes first in the file). However, you can set the optional boolean SwapBytes parameter to true in order to use the reverse byte order (the MSB comes first). When SwapBytes is true, the function first swaps the bytes of the Find and Mask parameters internally and then proceeds normally.
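The byte-by-byte stepping and the effect of SwapBytes can be illustrated with a small pure-Lua model. This is a sketch under stated assumptions rather than the tool's code: it works on a Lua string, the names band, swap16 and model_find_masked16 are hypothetical, and the 0-based byte offset it returns is an assumption.

```lua
-- Portable bitwise AND for non-negative integers.
local function band(a, b)
  local result, bit = 0, 1
  while a > 0 and b > 0 do
    if a % 2 == 1 and b % 2 == 1 then
      result = result + bit
    end
    a = math.floor(a / 2)
    b = math.floor(b / 2)
    bit = bit * 2
  end
  return result
end

-- Swap the two bytes of a 16-bit value: 0x1234 -> 0x3412.
local function swap16(v)
  return (v % 256) * 256 + math.floor(v / 256)
end

-- Hypothetical model of the 16-bit search over a Lua string.
-- Words are read LSB first (Intel order) at every byte offset.
local function model_find_masked16(data, start, find, mask, swap_bytes)
  find = find % 0x10000             -- only the LSW of Find is used
  mask = (mask or 0xffff) % 0x10000 -- omitted Mask means 0xffff
  if swap_bytes then
    find, mask = swap16(find), swap16(mask)
  end
  for offset = start, #data - 2 do  -- step one byte, not one word
    local lo, hi = string.byte(data, offset + 1, offset + 2)
    if band(lo + hi * 256, mask) == find then
      return offset
    end
  end
  return nil
end

-- File bytes: 00 12 34 56
local data = string.char(0x00, 0x12, 0x34, 0x56)
print(model_find_masked16(data, 0, 0x3412))             -- 1: found at an odd offset
print(model_find_masked16(data, 0, 0x1234, nil, true))  -- 1: same bytes, MSB-first constant
```

Both calls locate the byte pair 12 34 at offset 1. The second one writes the constant in the order the bytes appear in the file, which is what SwapBytes=true is for.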
FoundAt = FindMasked32(Find[, Mask[, SwapBytes]])
This function is almost identical to FindMasked16(), except that it operates on 32-bit words. Here is a real-life example from the MPEG template that searches an MPEG bitstream for the next start code (00 00 01 xx). Note how setting the SwapBytes parameter to true lets us write the start_code_prefix and the corresponding mask constant in their natural numeric representation.
-- setup bit and byte orders
MPEGBitOrder()
Motorola()
-- MPEG definitions
start_code_prefix = 0x00000100
-- find next start code. Returns the address of the found 0x000001xx, or nil.
-- Note that start codes in MPEG are byte-aligned
function FindNextStartCode()
  Align(1)
  return FindMasked32(start_code_prefix, 0xffffff00, true)
end
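To see why SwapBytes helps here, note that in the default Intel order the file bytes 00 00 01 xx form the 32-bit word 0xxx010000, so without swapping the constants would have to be written as 0x00010000 and 0x00ffffff. The sketch below checks this arithmetic in plain Lua. The helper swap32 is a hypothetical name used only for illustration; it models the internal swap described above and is not part of the tool.

```lua
-- Reverse the four bytes of a 32-bit value: 0xAABBCCDD -> 0xDDCCBBAA.
local function swap32(v)
  local result = 0
  for _ = 1, 4 do
    result = result * 256 + v % 256  -- append the current lowest byte
    v = math.floor(v / 256)          -- drop it from the input
  end
  return result
end

print(string.format("0x%08x", swap32(0x00000100)))  -- 0x00010000
print(string.format("0x%08x", swap32(0xffffff00)))  -- 0x00ffffff
```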
Len, FoundZero = AsciizStringLength()
This function scans the input file for a zero-terminated ASCII string. It returns two values. The first is the length of the string in characters (not counting the terminating zero). The second tells whether the terminating zero was found (true) or not (false). If no terminating zero was found, the first returned value is the number of bytes remaining up to the end of the input file.
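The two return values can be illustrated with a minimal pure-Lua model that scans a Lua string instead of the input file. The name model_asciiz_length and the explicit 0-based start offset are assumptions made for this sketch; the real function takes no arguments and starts at the current file position.

```lua
-- Hypothetical model: scan `data` from the 0-based offset `start`
-- for a zero byte and report (length, found_zero).
local function model_asciiz_length(data, start)
  for offset = start, #data - 1 do
    if string.byte(data, offset + 1) == 0 then
      return offset - start, true   -- length excludes the terminator
    end
  end
  return #data - start, false       -- no terminator: bytes left until EOF
end

-- Len = 3, FoundZero = true
print(model_asciiz_length("abc" .. string.char(0) .. "xyz", 0))
-- Len = 2, FoundZero = false
print(model_asciiz_length("abc", 1))
```

Callers that need a well-formed string should check the second value before trusting the first.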
Len, FoundZero = UnicodeStringLength()
This function is almost the same as AsciizStringLength(), except that it operates on Unicode characters (each character is two bytes). Please note that, unlike FindMasked16(), this function does not perform a detailed search: when the Unicode character at address Addr is not the Unicode zero (00 00), the function skips the whole character and checks the next one at address Addr+2.
Important: for historical reasons, this function returns the length of the string in bytes, not in Unicode characters. This length is always even, even when there is no terminating zero and the number of bytes remaining up to EOF is odd.
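The following pure-Lua model sketches the behavior described above: two-byte steps, a length reported in bytes, and an even result. It is illustrative only. The name model_unicode_length and the explicit 0-based start offset are hypothetical, and dropping an odd trailing byte (rounding down) is an assumption that is merely consistent with the "always even" rule.

```lua
-- Hypothetical model: scan `data` from the 0-based offset `start` in
-- 2-byte steps for the pair 00 00 and report (length_in_bytes, found_zero).
local function model_unicode_length(data, start)
  local offset = start
  while offset + 1 < #data do  -- a full 2-byte character is still available
    local b1, b2 = string.byte(data, offset + 1, offset + 2)
    if b1 == 0 and b2 == 0 then
      return offset - start, true
    end
    offset = offset + 2        -- skip the whole character
  end
  return offset - start, false -- always even; an odd trailing byte is ignored
end

-- Two characters followed by the terminator: Len = 4 (bytes), FoundZero = true
local terminated = string.char(0x41, 0x00, 0x42, 0x00, 0x00, 0x00)
print(model_unicode_length(terminated, 0))

-- The 00 00 at offsets 1..2 straddles two characters, so it is not a
-- terminator; the fifth byte is not counted: Len = 4, FoundZero = false
local straddling = string.char(0x41, 0x00, 0x00, 0x42, 0x43)
print(model_unicode_length(straddling, 0))
```

To get the length in characters, divide the first returned value by two.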