If you'd like to understand pack/unpack, there is a tutorial written for Perl that works equally well for understanding them in PHP:
http://perldoc.perl.org/perlpacktut.html
(PHP 4, PHP 5, PHP 7, PHP 8)
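pack - Pack data into a binary string
Packs the given arguments into a binary string according to format.
The idea for this function was taken from Perl, and all formatting codes work the same as in Perl. However, some formatting codes are missing, such as Perl's "u" format code.
Note that the distinction between signed and unsigned values only affects the function unpack(), whereas pack() gives the same result for signed and unsigned format codes.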
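The signed/unsigned remark can be checked with a short sketch. The byte value 0xFF is only an illustrative choice, not something the manual prescribes.

```php
<?php
// pack() produces identical bytes for the signed and unsigned codes.
$signed   = pack('c', -1);   // signed char
$unsigned = pack('C', 255);  // unsigned char
var_dump($signed === $unsigned); // both are the single byte 0xFF

// unpack() is where the distinction becomes visible.
$asSigned   = unpack('c', "\xFF")[1]; // read the byte as a signed char
$asUnsigned = unpack('C', "\xFF")[1]; // read the same byte as an unsigned char
var_dump($asSigned, $asUnsigned);
```

The same byte comes back as -1 through c and as 255 through C.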
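format
The format string consists of format codes followed by an optional repeater argument. The repeater argument can be either an integer value or * to repeat to the end of the input data. For a, A, h and H, the repeat count specifies how many characters of one data argument are taken; for @, it is the absolute position at which to put the next data (the gap before it is filled with NUL bytes); for everything else, the repeat count specifies how many data arguments are consumed and packed into the resulting binary string.
Currently implemented formats are: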
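| Code | Description |
|---|---|
| a | NUL-padded string |
| A | SPACE-padded string |
| h | Hex string, low nibble first |
| H | Hex string, high nibble first |
| c | signed char |
| C | unsigned char |
| s | signed short (always 16 bit, machine byte order) |
| S | unsigned short (always 16 bit, machine byte order) |
| n | unsigned short (always 16 bit, big endian byte order) |
| v | unsigned short (always 16 bit, little endian byte order) |
| i | signed integer (machine dependent size and byte order) |
| I | unsigned integer (machine dependent size and byte order) |
| l | signed long (always 32 bit, machine byte order) |
| L | unsigned long (always 32 bit, machine byte order) |
| N | unsigned long (always 32 bit, big endian byte order) |
| V | unsigned long (always 32 bit, little endian byte order) |
| q | signed long long (always 64 bit, machine byte order) |
| Q | unsigned long long (always 64 bit, machine byte order) |
| J | unsigned long long (always 64 bit, big endian byte order) |
| P | unsigned long long (always 64 bit, little endian byte order) |
| f | float (machine dependent size and representation) |
| g | float (machine dependent size, little endian byte order) |
| G | float (machine dependent size, big endian byte order) |
| d | double (machine dependent size and representation) |
| e | double (machine dependent size, little endian byte order) |
| E | double (machine dependent size, big endian byte order) |
| x | NUL byte |
| X | Back up one byte |
| Z | NUL-terminated (ASCIIZ) string, will be NUL padded |
| @ | NUL-fill to absolute position |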
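The three meanings of the repeat count described above can be sketched as follows. The input values are hypothetical and chosen only to keep the hex output short.

```php
<?php
// For a and A the count is a character length taken from ONE argument.
$nulPadded   = pack('a4', 'ab');    // "ab" followed by two NUL bytes
$spacePadded = pack('A4', 'ab');    // "ab" followed by two spaces

// For numeric codes the count is the number of arguments consumed.
$twoShorts = pack('n2', 1, 2);      // two big endian 16-bit values
$allBytes  = pack('C*', 1, 2, 3);   // * consumes every remaining argument

// For @ the number is an absolute offset; the gap is filled with NUL bytes.
$positioned = pack('C@4C', 1, 2);   // byte 1, NUL-fill up to offset 4, byte 2

echo bin2hex($nulPadded), "\n";     // 61620000
echo bin2hex($spacePadded), "\n";   // 61622020
echo bin2hex($twoShorts), "\n";     // 00010002
echo bin2hex($allBytes), "\n";      // 010203
echo bin2hex($positioned), "\n";    // 0100000002
```

bin2hex() is used here only to make the binary strings printable.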
values
Returns a binary string containing the data.
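| Version | Description |
|---|---|
| 8.0.0 | This function no longer returns false on failure. |
| 7.2.0 | float and double types now support both big endian and little endian. |
| 7.0.15, 7.1.1 | The "e", "E", "g" and "G" codes were added to enable byte order support for float and double. |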
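As a minimal sketch of the byte-order codes for float and double, assuming a PHP version that has them (7.0.15 / 7.1.1 or later) and a platform with IEEE 754 floats, the value 1.0 packs like this:

```php
<?php
// 1.0 as a 4-byte float: big endian (G) and little endian (g).
$floatBig    = pack('G', 1.0);
$floatLittle = pack('g', 1.0);

// 1.0 as an 8-byte double: big endian (E) and little endian (e).
$doubleBig    = pack('E', 1.0);
$doubleLittle = pack('e', 1.0);

echo bin2hex($floatBig), "\n";      // 3f800000
echo bin2hex($floatLittle), "\n";   // 0000803f
echo bin2hex($doubleBig), "\n";     // 3ff0000000000000
echo bin2hex($doubleLittle), "\n";  // 000000000000f03f

// Round trip: unpack() with the same code restores the value.
$restored = unpack('G', $floatBig)[1];
var_dump($restored === 1.0);
```

Unlike f and d, these codes give the same bytes regardless of the host byte order.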
Example #1 pack() example
<?php
$binarydata = pack("nvc*", 0x1234, 0x5678, 65, 66);
?>
The resulting binary string is 6 bytes long and contains the byte sequence 0x12, 0x34, 0x78, 0x56, 0x41, 0x42.
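One way to confirm that byte sequence is to dump the result with bin2hex(). This is only a verification sketch around the same call as the example:

```php
<?php
$binarydata = pack("nvc*", 0x1234, 0x5678, 65, 66);

// n stores 0x1234 big endian (12 34), v stores 0x5678 little endian (78 56),
// and c* stores each remaining argument as one byte (41 42).
echo strlen($binarydata), "\n";   // 6
echo bin2hex($binarydata), "\n";  // 123478564142
```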
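Format codes q, Q, J and P are not available on 32-bit builds of PHP.
Note that PHP internally stores int values as signed values of a machine-dependent size. Integer literals and operations that yield numbers outside the bounds of the int type are stored as float. When these floats are packed as integers, they are first cast to the integer type. This may or may not result in the desired byte pattern.
The most relevant case is packing unsigned numbers that would be representable with the int type if it were unsigned. On systems where the int type is 32 bits wide, the cast usually results in the same byte pattern as if the int were unsigned (although this relies on implementation-defined unsigned-to-signed conversions, as per the C standard). On systems where the int type is 64 bits wide, the float most likely does not have a mantissa large enough to hold the value without loss of precision. If those systems also have a native 64-bit C int type (most UNIX-like systems don't), the only way to use the I pack format in the upper range is to create negative int values with the same byte representation as the desired unsigned value.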
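The negative-value workaround can be sketched on a 64-bit build. The fixed-size codes N and J are used here instead of I so the byte counts do not depend on the C int size; the guard on PHP_INT_SIZE is an assumption about how one might skip the code on 32-bit builds.

```php
<?php
// A sketch assuming a 64-bit PHP build; q, Q, J and P are unavailable on 32-bit builds.
if (PHP_INT_SIZE >= 8) {
    // 0xFFFFFFFF fits in a 64-bit int, so N packs it without a float detour.
    $max32 = pack('N', 0xFFFFFFFF);
    echo bin2hex($max32), "\n";          // ffffffff
    echo unpack('N', $max32)[1], "\n";   // 4294967295

    // 2^64 - 1 cannot be stored as an int. The int -1 has the same
    // two's-complement byte pattern, so it is packed instead.
    $max64 = pack('J', -1);
    echo bin2hex($max64), "\n";          // ffffffffffffffff
    echo unpack('J', $max64)[1], "\n";   // -1, because PHP ints are signed
}
```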
A helper class to convert integers to binary strings and vice versa. Useful for writing and reading integers to/from files or sockets.
<?php
class int_helper
{
public static function int8($i) {
return is_int($i) ? pack("c", $i) : unpack("c", $i)[1];
}
public static function uInt8($i) {
return is_int($i) ? pack("C", $i) : unpack("C", $i)[1];
}
public static function int16($i) {
return is_int($i) ? pack("s", $i) : unpack("s", $i)[1];
}
public static function uInt16($i, $endianness=false) {
$f = is_int($i) ? "pack" : "unpack";
if ($endianness === true) { // big-endian
$i = $f("n", $i);
}
else if ($endianness === false) { // little-endian
$i = $f("v", $i);
}
else if ($endianness === null) { // machine byte order
$i = $f("S", $i);
}
return is_array($i) ? $i[1] : $i;
}
public static function int32($i) {
return is_int($i) ? pack("l", $i) : unpack("l", $i)[1];
}
public static function uInt32($i, $endianness=false) {
$f = is_int($i) ? "pack" : "unpack";
if ($endianness === true) { // big-endian
$i = $f("N", $i);
}
else if ($endianness === false) { // little-endian
$i = $f("V", $i);
}
else if ($endianness === null) { // machine byte order
$i = $f("L", $i);
}
return is_array($i) ? $i[1] : $i;
}
public static function int64($i) {
return is_int($i) ? pack("q", $i) : unpack("q", $i)[1];
}
public static function uInt64($i, $endianness=false) {
$f = is_int($i) ? "pack" : "unpack";
if ($endianness === true) { // big-endian
$i = $f("J", $i);
}
else if ($endianness === false) { // little-endian
$i = $f("P", $i);
}
else if ($endianness === null) { // machine byte order
$i = $f("Q", $i);
}
return is_array($i) ? $i[1] : $i;
}
}
?>
Usage example:
<?php
Header("Content-Type: text/plain");
include("int_helper.php");
echo int_helper::uInt8(0x6b) . PHP_EOL; // k
echo int_helper::uInt8(107) . PHP_EOL; // k
echo int_helper::uInt8("\x6b") . PHP_EOL . PHP_EOL; // 107
echo int_helper::uInt16(4101) . PHP_EOL; // \x05\x10
echo int_helper::uInt16("\x05\x10") . PHP_EOL; // 4101
echo int_helper::uInt16("\x05\x10", true) . PHP_EOL . PHP_EOL; // 1296
echo int_helper::uInt32(2147483647) . PHP_EOL; // \xff\xff\xff\x7f
echo int_helper::uInt32("\xff\xff\xff\x7f") . PHP_EOL . PHP_EOL; // 2147483647
// Note: Test this with 64-bit build of PHP
echo int_helper::uInt64(9223372036854775807) . PHP_EOL; // \xff\xff\xff\xff\xff\xff\xff\x7f
echo int_helper::uInt64("\xff\xff\xff\xff\xff\xff\xff\x7f") . PHP_EOL . PHP_EOL; // 9223372036854775807
?>
Note that the command in the example above looks like this in Perl:
$binarydata = pack ("n v c*", 0x1234, 0x5678, 65, 66);
In PHP it seems that no whitespace is allowed in the first parameter. So if you want to convert your pack command from Perl to PHP, don't forget to remove the whitespace!
If you need to unpack a signed short from big-endian or little-endian specifically, instead of machine byte order, you need only unpack it as the unsigned form and then, if the result is >= 2^15, subtract 2^16 from it.
An example would be:
<?php
$foo = unpack("n", $signedbigendianshort);
$foo = $foo[1];
if($foo >= pow(2, 15)) $foo -= pow(2, 16);
?>
<?php
/* Convert float from host order to network order */
function FToN($val)
{
    // L is always 32 bits wide, matching a 4-byte float; I has a machine dependent size
    $a = unpack("L", pack("f", $val));
    return pack("N", $a[1]);
}
/* Convert float from network order to host order */
function NToF($val)
{
    $a = unpack("N", $val);
    $b = unpack("f", pack("L", $a[1]));
    return $b[1];
}
?>
Even though on a 64-bit architecture intval(6123456789) = 6123456789 and sprintf('%b', 5000000000) = 100101010000001011111001000000000,
the 32-bit format codes such as N will not pack a value as 64-bit. If you want to pack a 64-bit integer without the 64-bit codes (q, Q, J, P):
<?php
$big = 5000000000;
// Shift first, then mask: the literal 0xffffffff00000000 exceeds PHP_INT_MAX and would be stored as a float
$l = ($big >> 32) & 0xffffffff;
$r = $big & 0xffffffff;
$good = pack('NN', $l, $r);
$urlsafe = str_replace(array('+','/'), array('-','_'), base64_encode($good));
//done!
//rebuild:
$unurl = str_replace(array('-','_'), array('+','/'), $urlsafe);
$binary = base64_decode($unurl);
$set = unpack('N2', $binary);
print_r($set);
$original = $set[1] << 32 | $set[2];
echo $original, "\r\n";
?>
results in:
Array
(
[1] => 1
[2] => 705032704
)
5000000000
but ONLY on a 64-bit enabled machine and PHP distro.
You will get the same effect with
<?php
function _readInt($fp)
{
// unpack() returns an array; element 1 holds the integer
$data = unpack('V', fread($fp, 4));
return $data[1];
}
?>
or unpack('N', ...) for big-endianness.
Be aware that format code H always pads with a 0 nibble on the right for byte alignment (when the count of nibbles is odd).
So pack("H", "7") results in 0x70 (ASCII character 'p') and not in 0x07 (BELL character),
and pack("H*", "347") results in 0x34 ('4') and 0x70 ('p') and not 0x03 and 0x47.
Sorry, but I use AI ;-)
I was talking about memory optimisation and performance with Google Gemini. Here is a nice example for the \pack function. (Using `\pack` rather than `pack` is also performance related.)
If your build tool generates giant in-memory lookup tables (like IP routing zones, geo-location grids, or localized translation indices), consider not storing them as standard multidimensional PHP arrays. Each PHP array element carries zval and hash-table bookkeeping overhead.
Instead, pack the data into raw binary sequences using pack() and look it up via byte offsets with substr().
The Memory Comparison:
Imagine storing 50,000 coordinate status IDs.
// ❌ Array Allocation: Takes ~6 Megabytes of RAM
$data = [10023, 10024, 10025, ...];
// 🚀 Packed Binary String: Takes ~200 Kilobytes of RAM (30x less memory)
$packedData = \pack('N*', 10023, 10024, 10025);
How you execute an O(1) read:
Because you packed the integers using the N format (unsigned 32-bit big-endian integers), you know with absolute mathematical certainty that every single number occupies exactly 4 bytes of space inside that string.
To read index number 5,000, you don't map arrays. You calculate the direct byte offset instantly:
// Direct memory offset extraction via string pointer shifting
$byteOffset = 5000 * 4;
$binarySegment = \substr($packedData, $byteOffset, 4);
// Unpack back to an integer instantly
$unpacked = \unpack('Nid', $binarySegment);
$id = $unpacked['id'];
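A runnable version of the lookup above might look like the following sketch. The ID range and the helper name readPackedId() are hypothetical, introduced only for illustration; no memory measurements are implied.

```php
<?php
// Hypothetical data: 50,000 sequential IDs starting at 10023.
$ids = range(10023, 10023 + 49999);

// N stores every ID as exactly 4 bytes (unsigned 32-bit, big endian).
$packedData = pack('N*', ...$ids);

/**
 * Hypothetical helper: read the ID stored at a zero-based index.
 */
function readPackedId(string $packed, int $index): int
{
    if ($index < 0) {
        throw new OutOfRangeException("Negative index $index");
    }
    $segment = substr($packed, $index * 4, 4);
    if (strlen($segment) !== 4) {
        throw new OutOfRangeException("No entry at index $index");
    }
    return unpack('Nid', $segment)['id'];
}

echo strlen($packedData), "\n";              // 200000
echo readPackedId($packedData, 5000), "\n";  // 15023
```

The string length is simply 50,000 entries times 4 bytes. The bounds check matters because substr() returns a short or empty string past the end of the data rather than raising an error.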
To the Zend Engine, a string is just a flat, contiguous block of memory. By using binary packed strings for large lookup tables, you avoid the per-element zval overhead. The raw numbers sit directly beside each other in RAM, which keeps the application footprint light and makes it more likely that the data structure fits in the CPU's L2 or L3 cache.
pack()
h Hex string, low nibble first (not the same as hex2bin())
H Hex string, high nibble first (same as hex2bin())
Using pack to write Arabic char(s) to a file.
<?php
$text = "&#13574;&#13830;&#13830;";
$text = mb_convert_encoding($text, "UCS-2BE", "HTML-ENTITIES");
$len = mb_strlen($text);
$bom = mb_convert_encoding("", "unicode", "HTML-ENTITIES");
$fp = fopen('text.txt', 'w');
fwrite($fp, pack('a2', $bom));
fwrite($fp, pack("a{$len}", $text));
fwrite($fp, pack('a2', $bom));
fwrite($fp, pack('a2', "\n"));
fclose($fp);
?>