Embroidery format: Difference between revisions

 
(35 intermediate revisions by 3 users not shown)
Line 10: Line 10:
Embroidery format PEC‎‎,
Embroidery format PEC‎‎,
Embroidery format PES‎,
Embroidery format PES‎,
Embroidery format U??,
Embroidery format VP3,
Embroidery format VP3,
Stitch format PMV
|cat_syllabus=Embroidery format
|cat_syllabus=Embroidery format
}}
}}
Line 17: Line 19:
There are several kinds of Embroidery file formats, and each contains different abilities and features. Some formats have different versions with increased features. Usually these try to preserve backwards compatibility because of the cost of the hardware in question.  
There are several kinds of Embroidery file formats, and each contains different abilities and features. Some formats have different versions with increased features. Usually these try to preserve backwards compatibility because of the cost of the hardware in question.  


This page attempts to provide a short overview on embroidery formats. For technical details, see the specialized articles (menu to the right).
This page attempts to provide a short overview of embroidery formats. For technical details, see the specialized articles (menu to the right). The technical details given in this article are only meant to explain why the formats are the way they are.


== Kinds of Embroidery File Types ==


* All sorts of 2D bitmap and vector formats for the drawings, i.e. formats that are not specifically made for embroidery
== The Intended Functions ==
* Embroidery file formats that work on a range of machines and also can be used as exchange formats. These are sometimes called '''commercial''' formats.
* Embroidery file formats that are mostly brand or even type specific
* Both so-called commercial file formats and more brand-specific formats come in two forms: Some '''only have stitching''' information, others keep '''information that makes them''' easily editable. The latter could be called '''worker''' files.


It seems that there are about 30 different formats. It is not clear to me what different formats can do. Also, I don't know ''how'' formats are supported by various vendors. Some formats seem to be barely editable since they only contain stitching instructions like "go to x/y" and "add a stitch from x1/1y to x2/y2" or "change thread". Others may include precise information about the shape and kind of a design part and keep stitching information apart, i.e. an area is not just defined in terms of stitches. The latter are more easily editable. Other formats (like JEF) may keep just some information, e.g. colors.
A lot of the confusion around embroidery formats comes from the diversity of hardware and software, which makes the whole field look like an unfathomable muddle. The problem can nevertheless be classified properly. The most significant divide is between files intended to run a machine and files intended for the programs that create those files. It is like the difference between a complex layered Photoshop file with undo history and a JPEG: both can be called "image files", but the distinction is massive. We divide the formats along this line to avoid the confusion.


    The stitching information is actually less informative than that. It actually means move x, y, and stitch or move x, y and don't stitch, or stitch then move x, y. The machine is stitching whether you move or not. But, whether the stitch hits after the movement or before actually differs by format and machine. Stitch doesn't mean put thread between points a and b. It means go somewhere before the needle hits. Just as jump can be taken as do what stitch does but also block that needle bar.
== Machine Embroidery Files ==


It is sometimes difficult to find out what formats a specific machine from various Brands can read. E.g. Bernina's feature their own brand-specific editable *.art formats, but it seems that the high-end machines directly can read *.exp which is a commercial  format, if I understood right. When I bought an Elna 8300, '''no''' information about formats was included in the documentation (or I couldn't find it) ...
Often the actual controllers within the machines are similar, and the encoding schemes for the embroidery files turn out to be quite similar too. The files are intended to control a stepper or servo motor, an x-y plotter and a needle head, and they don't do much beyond that. The stitch portion of a file encodes three things: a control command, a dx, and a dy. This encoding is also heavily influenced by the physical requirements of the machine. The machine must stitch unless the needle bar is blocked, which is the difference between a stitch and a jump. When enough jumps occur in a row, machines will often force a trim. Some machines can block the needle bar and move very far from the last stitch; others will force a trim. Such long moves are needed in some cases, like fringe or puffy applique. Given these requirements, the commands used within the machine code are often similar, and sometimes identical, once you get past the header and into the actual stitches within the file.


To make the situation worse, some formats have different subtypes. E.g. the popular .PES comes in ''eight'' (actually it's no fewer than 12, but mostly settings in the program that edits them, there's only the 1 type of worker information within them) different versions :( - I once thought that the situation was really bad for video codecs, video containers or 3D vector formats, but embroidery beats anything else I am aware of in terms of obscurity and diversity.
The header information is often of a specific length and contains a lot of display data for the machines. These are quite often specifically intended for exactly that display module. But, it also means that some formats like .DST, .DSB, .DSZ all have the same header information even though the stitch data is entirely different. These different encoding schemes result in entirely new file types while not doing anything new.


Vendors include conversion software that can translate to their (and other's) machine readable CNC formats from a series of other low-level and also from more high-level formats. The most popular exchange formats seem to be DST and EXP, but these are not necessarily the best. As little as we know of today (after few hours spent on exploration) a good format (e.g. EMB) includes a vector description of each design part and attaches abstract stitching information to that object. This way it can be transformed without deforming stitches. Less powerful formats are directly stitch-based. The most popular rather machine-specific format seems to be *.pes (Brother) since it also includes worker information.
To help matters, or rather make them worse, some companies like Wilcom will make new file types that just contain the stitch data. For example .t01 is just .dst encoded stitches without a header. So is .tap. Literally these formats are identical and interchangeable, and you could create them by deleting the first 512 bytes from a dst file. Wilcom also has .t?? with other numbers that refer to other encodings.  
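As a minimal sketch of the point above, assuming a well-formed .dst file with the 512 byte header described in this article, the headerless stitch stream can be obtained by dropping that header. The function name is hypothetical and is not taken from any vendor tool.

```python
DST_HEADER_SIZE = 512  # header length stated in this article


def strip_dst_header(dst_bytes: bytes) -> bytes:
    """Return the raw DST-encoded stitch stream without the 512 byte header."""
    if len(dst_bytes) < DST_HEADER_SIZE:
        raise ValueError("too short to contain a DST header")
    return dst_bytes[DST_HEADER_SIZE:]
```

Writing the returned bytes to a file with a .t01 or .tap extension would then give the header-free variant described here.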


Different machines have their own formats because each format consists of commands intended only for that particular machine, i.e. the specific ways in which that hardware operates. This is akin to having a different file type for the executables of each type of processor, and executables are exactly what these files are intended to be.
=== Interchange Formats ===
One consequence of the need for simple machine-readable formats is that, once you cut away the headers, the formats can be used as interchange formats. Embroidery machines do only a few things, and those need to be easily encoded in ways that the hardware can understand and execute. This means these formats are fairly easy to parse in binary, which gives access to the stitch data therein. The ubiquity of .dst files is largely due to the age of the format, but also because many formats are derived from it and because the early hardware came from the company that produced it.


=== Machine Encoding ===
=== Machine Encoding ===
 
The encoding of the format for the computer varies from company to company and from software maker to software maker, but the encodings often share a common history. For example, Pulse developed the early software for Brother, called PG1. It is also telling that the PEC blocks used by Brother, just like .dst files, have a 512 byte header starting with a label "LA: (NAME)". In 1982, Pulse Microsystems also developed the first PC software, Stitchworks. The structure of the files is likely due to Pulse's as well as Tajima's dominance in the industry in the 1990s.
The most significant differences between embroidery files with the intent of running a machine, and those intended to run the programs that make those files. Often the actual controllers within the machines are similar and the encoding scheme for the embroidery files turns out to be quite similar. They are intended to control a stepper or servo motor, and a x-y plotter and a needle head. They don't do much beyond that. A lot of files simply encode for three things: a control command, an dx, and dy. This encoding is also heavily influenced by the actual physical requirement of the machine. The machines must stitch, unless the needle bar is blocked, which is the difference between a stitch and a jump. When enough of these are in a row, machines will often force a trim. It's possible with some machines to block the needle bar and move very far from the last stitch, others will force a trim. There are some cases like fringe where this is needed or puffy applique. Given this requirement, the commands used within the machine code is often similar sometimes identical when you get away from the header and into the actual stitches within the file.


==== Command Encoding Schemes ====
==== Command Encoding Schemes ====


# Triplet, Tajima Punch-Card DST - triplet code generally with the +x and +y and control bits combined in a particular fashion together. With + and - values for the bits. The important control bits being located on the 3rd bit with the lowest two values of the control bit always set.
{| class="wikitable"
# Pairs, Signed X, Y with 0x80 triggered control events. 2 byte stitch, 4 byte control. If the X value is set to 0x80 (-128 as a signed byte), this invokes the control form, where the next byte (which would have been "Y") serves as a command, usually with 0x01 being stop, 0x04 being jump and 0x10 being end. The next two bytes belong to the control and can carry positional values, for things like a move.
|-
# Triplet, Unsigned X, Y, Control. The control byte provides the direction for the plotter so it has the sign. The first bit of the command bit is always set. The next two bits control the sign of the x and y values. Usually the last bit controls whether that command is stitched or not (jump vs. stitch), block the needle bar or don't.
! Code Length !! Company Associated !! Structure !! bytes !! Encoding
# Pairs (Varies), Signed 7 or 12 + Control Nibble. 2, 3, 4 byte encoding. Each value of the X and Y is read individually. When the highest bit is set (0x80) it triggers long mode. It means that the top 4 bits are control (the first one being used to trigger long mode). And the bottom nibble is appended to the next byte in the stream. So if the highest bit isn't set it's a 7 bit signed number. If the highest bit is set then control bits may come into effect and the number becomes a 12 bit signed number. This allows not only trigger control events but allowing an optional long mode. Only PEC blocks use this.
|-
| 3 || Pulse, Tajima, Eltac  || 3 byte swizzled bits, punchcard || yyyyxxxx yyyyxxxx ccyyxx11 || Triplet code has control bits on the final byte. Last 2 bits of that byte are always set.  
|-
| 2 (4) || Melco || signed(x,y), x=0x80 control || xxxxxxxx yyyyyyyy || Series of move locations, with control events triggered by x = 0x80 (-128 as a signed byte): 0x01 stop, 0x04 jump, 0x10 end.
|-
| 3 || Barudan || unsigned(x,y) control byte || xxxxxxxx yyyyyyyy 1ccccccc || The control byte provides the direction of (x,y), i.e. the sign. The last bit controls jump vs. stitch.
|-
| 3 || Foxtron || control byte, unsigned(y,x) || 1ccccccc yyyyyyyy xxxxxxxx || Very similar to the previous scheme, but differs in the byte order of the triplet.
|-
| 2 (3?, 4) || Brother || signed 7 bit (x,y) || cxxxxxxx cyyyyyyy || If the high bit is set, the value becomes a 12 bit signed number, and the top 4 bits of that byte (the first one triggering the long form) are control flags.
|-
| 2 (4) || Compucon, Singer || signed(x,y), x == 0x7F or 0x7D || xxxxxxxx yyyyyyyy || 0x7F triggers a control event, whereas 0x7D triggers long form: x and y are then read as two signed 16 bit integers.
|-
| 2 || KSM || unsigned(x,y) control byte || xxxxxxxx yyyyyyyy 1ccccccc || The control byte controls the restart of stitching after a stop. Before stitching restarts, commands are unstitched but otherwise identical to stitches.
|}
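The Melco row of the table above (signed pairs with x = 0x80 as the control trigger) could be read roughly as follows. This is only a sketch: the command values 0x01, 0x04 and 0x10 come from the table, while the function names, the event tuples and the choice to stop reading at the end command are assumptions made for illustration.

```python
END, STOP, JUMP = 0x10, 0x01, 0x04  # command values listed in the table above


def signed8(b: int) -> int:
    """Interpret a byte (0..255) as a two's complement signed value."""
    return b - 256 if b > 127 else b


def decode_pairs(data: bytes):
    """Decode a 0x80-controlled pair stream into (kind, dx, dy) events."""
    events = []
    i = 0
    while i + 1 < len(data):
        x, y = data[i], data[i + 1]
        i += 2
        if x != 0x80:
            # Plain 2 byte form: a stitch with signed dx, dy
            events.append(("stitch", signed8(x), signed8(y)))
            continue
        if y == END:
            events.append(("end", 0, 0))
            break
        if i + 1 >= len(data):
            break  # truncated control record
        # 4 byte control form: the two bytes after the command carry dx, dy
        dx, dy = signed8(data[i]), signed8(data[i + 1])
        i += 2
        kind = {STOP: "stop", JUMP: "jump"}.get(y, "unknown")
        events.append((kind, dx, dy))
    return events
```

Because 0x80 is reserved as the trigger, a plain stitch in this scheme can never have dx = -128.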


There are minor variations of these schemes, sometimes with different endianness and byte order. Some, like XXX, use 0x7F as control and 0x7D as long mode, in which case X and Y are 16 bit signed integers read right after the 0x7D. KSM uses x, y, control triplets but permits non-stitching x, y locations until stitching is re-enabled by a different control event.
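The Brother (PEC) row of the table above, with its optional long form, can be illustrated by decoding a single coordinate value. This is a sketch under stated assumptions: it follows the description given here (high bit triggers long form, top nibble is control, low nibble plus the next byte form a 12 bit signed number), returns the three control bits uninterpreted, and ignores any special marker bytes a real file may contain. The function name is hypothetical.

```python
def read_pec_value(data: bytes, pos: int):
    """Decode one coordinate; return (value, control_bits, next_pos)."""
    first = data[pos]
    if first & 0x80 == 0:
        # Short form: a 7 bit two's complement number in one byte
        value = first - 0x80 if first & 0x40 else first
        return value, 0, pos + 1
    # Long form: top nibble is control, low nibble joins the next byte
    control = (first >> 4) & 0x07  # the three flags after the trigger bit
    raw = ((first & 0x0F) << 8) | data[pos + 1]
    value = raw - 0x1000 if raw & 0x800 else raw  # 12 bit two's complement
    return value, control, pos + 2
```

A stitch would be read by calling this once for x and once for y, continuing from the returned position, which is why the record can be 2, 3 or 4 bytes long.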
The DST (Pulse, Tajima, Eltac) punchcard swizzle is the oddest and oldest of these. I cannot help imagining the hardware of the reader. The high bits control Y and the low bits control X. One could imagine different mechanisms reading different parts of the card and controlling the x or y part of the plotter. Reading from bottom to top, you would get the control bits and the biggest movement, depending on whether there is a hole and on the direction that hole stands for, with the next byte having a third of that impact, and a third again for each byte you go up. It would also explain why the third byte always ends in 11 (??????11): if you turned the physical card around, those bits would sit in the position of the control bits. 00000011 is the standard no-op for that byte, but read backwards as 11000000 it would set c0 and c1 and therefore STOP, so a backwards card would be detected as a stop. It would be nice to look at a pre-computer punch card with this encoding. [[Computerized embroidery#History|Welcome to the 21st century!]]
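The swizzle can be made concrete by summing the weight of every set bit, using the bit positions from the DST Encoding table in this article. This is a minimal sketch: the function name and return shape are assumptions, and the two control bits are returned raw rather than interpreted.

```python
# Per byte: bit number -> (x weight, y weight), from the DST Encoding table
WEIGHTS = [
    {7: (0, 1), 6: (0, -1), 5: (0, 9), 4: (0, -9),
     3: (-9, 0), 2: (9, 0), 1: (-1, 0), 0: (1, 0)},
    {7: (0, 3), 6: (0, -3), 5: (0, 27), 4: (0, -27),
     3: (-27, 0), 2: (27, 0), 1: (-3, 0), 0: (3, 0)},
    {5: (0, 81), 4: (0, -81), 3: (-81, 0), 2: (81, 0)},
]


def decode_dst_triplet(b1: int, b2: int, b3: int):
    """Return (dx, dy, c0, c1) for one 3 byte DST record."""
    dx = dy = 0
    for byte, weights in zip((b1, b2, b3), WEIGHTS):
        for bit, (wx, wy) in weights.items():
            if byte & (1 << bit):
                dx += wx
                dy += wy
    c0 = bool(b3 & 0x80)  # bit 7 of the third byte
    c1 = bool(b3 & 0x40)  # bit 6 of the third byte
    return dx, dy, c0, c1
```

With weights of 1, 3, 9, 27 and 81 in each direction, a single record can move at most 121 units along either axis.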


=== Vector Files ===
{| class="wikitable"
 
|+ DST Encoding
At the other extreme there are vector files that serve to create embroidery files by storing all the data needed to create an embroidery. This means storing the vector shapes, fill types, offsets and motifs, the order in which these are placed, and the start and stop locations. The software then generates a series of line segments through a variety of algorithms and writes them into the machine-readable formats for the machines to read and follow.
! BYTE !! 7 !! 6 !! 5 !! 4 !! 3 !! 2 !! 1 !! 0
 
|-
=== Hybrid Machine/Vector Files ===
| 1 || y+1 || y-1 || y+9 || y-9 || x-9 || x+9 || x-1 || x+1
 
|-
Some files like PES actually have both of these. They contain a pointer to a PEC block that is entirely intended for brother embroidery machines to use. These blocks also exist in PEC files that simply say #PEC0001 and then have the PEC block, or within PES files or PHB and PHC files that equally also contain a PEC block.
| 2 || y+3 || y-3 || y+27 || y-27 || x-27 || x+27 || x-3 || x+3
 
|-
== Type and Structure of Embroidery Files ==
| 3 || c0 || c1 || y+81 || y-81 || x-81 || x+81 || set || set
* .10o (Toyota) contains unsigned(x,y,control) style encoding, control byte, x, y. A separate .00o contain color information for the Toyota machine format.
|-
* .100 (Toyota) contains 4 byte encoding. With two bytes of control bytes with the 3rd and 4th byte signed x, y locations.
|}
* .art file, contains a Compound File Binary Format, of a series of files. They have different classes of file according to whether it contains the design information. Different files within the format contain the summary information, the Design Information, contents (the compressed stitch data, zlib 4 bytes in), the Design Icon, a bitmap of the what the design should look like. They classify each file with regard to the amount of information is in the .art file. So having the contents means it can sew, but without the design information, edits would only be possible at the stitch information.
* .bro (Bits & Volts), 256 bytes of header. 0x80 triggers control event, 2 stop, 3 jump, otherwise stitch x, y
* .dat (Barudan) contains 256 bytes of header, Triplet code signed(x) signed(y) and control code.
* .dsb (Tajima for Barudan) contains identical header to .dst files. However it's encoded unsigned(x,y,control) style.
* [[Embroidery format DST|.dst (Tajima) (See detailed article)]] contains a header of 512 bytes with design information statistics. DST encoded direct commands.
* .dsz (Tajima for ZSK USA) contains 512 bytes of dst header information. Then unsigned(x,y,control) style encoding. However the bytes are weirdly ordered going y, x, control. It also specifies the needle for the stop, using 4 bits to encode various values therein.
* .edr (Embird) this is fully fledged vector encoding data for Embird software.
* .emb (Wilcom) this is a fully fledged vector format stored very similarly to .ART and clearly share a code base. Several elements stored via a byte replacement cypher in a zlib compressed stream of data after a file size.
* .emd (Elna) 48 bytes of header (0x30 in hex) followed by 0x80 controlled x, y series. With 0x80 0x2A being stop/color_change 0x80 being Jump. and ending with 0x80 0xFD.
* [[Embroidery format EXP|.exp (Expanded Melco) (See detailed article)]] this is EXP encoded data without a header. X, Y are signed values. If X is 0x80 (-128 as a signed byte) it triggers a control event and those 2 bytes are control values. The next two bytes also apply to the control. So a color change 0x80 0x01 is followed directly by 0x00 0x00, and a jump 0x80 0x04 actually goes to the X, Y position that follows it, but must be repeated for each new command.
* .exy (Eltac) 256 bytes of header. Triplet coded in DST encoding.
* .fxy (Forton) unsigned(x,y,control) style encoding, 256 bytes of header
* .gt (Golden Thread) unsigned(x,y,control) style encoding with 512 bytes of header.
* .hus (Husqvarna Viking) Compressed bytes. Using a small table Arc compression.
* .inb (Inbro) (unverified) 8192 of header. Followed by unsigned(x,y,control) style encoding.
* .jef (Janome) Header information, magic-number thread lookup, and 0x80 triggered control events.
* .ksm (Pfaff) 512 bytes of header. y, x, control triplets. However, it doesn't force that encoding on all jumps. Rather after triggering a color change to a specific needle, it simply gives normal encoded x, y location (indistinguishable from stitches) until control bits of 0x19 triggers sewing again.
* .max (Pfaff)
* .mit (Mitsubishi)
* .new (Ameco)
* .ofm (Melco)
* .pcd (Pfaff)
* .pcm (Pfaff)
* .pcq (Pfaff)
* .pcs (Pfaff)
* [[Embroidery format PEC|.pec (Brother, Babylock) (see detailed article)]]header of #PEC0001 then a pec block. This encodes not just magic number colors, but also graphics which are displayed on the machines themselves. And contain high bit long form + control triggering.
* [[Embroidery format PES|.pes (Brother, Babylock) (see detailed article)]], contains several different layers of information. After the header #PES00XX which determines the version of the file, it contains the position in the file containing the PEC block. The PEC block is information intended for the machine. This some information about the design, name, number of stitches, size, location of graphics information, followed by blanks equalling up to 512 bytes. And a series of direct commands for the design. This is followed by 1 bit graphic bitmaps. All of which are intended for the machine to run. Regardless of the version of the file. The program simply needs to read the location of the pec block, jump forward 512 bytes and read the direct commands. Or jump forward 22 bytes, read the graphics location, and then seek to that location and read the graphics to be displayed on the embroidery machine's screen. Different versions contain different information in the various blocks within the file that are jumped over. These include vector information and design specific instructions that are able to rebuild the stitch data from scratch. So an alteration can allow regeneration of the stitch data.
* .phb, (Brother) Header, bunch of info and a PEC block.
* .phc, (Brother) Header, bunch of info and a PEC block.
* .sew (Janome) magic number thread lookups. Signed x, y with 0x80 triggered control events.
* .shv (Husqvarna Viking) Big old giant 1 bit graphic, of varying size. Magic color index threads, duplex code 0x80 controlled events, with predefined length of stitching events before color switching.
* .sst (Sunstar)
* .stx (Data Stitch) Full thread data,
* .t01 (Pfaff) contains DST stitches with no header.
* .t09 (Pfaff)
* .tap (Happy) contains DST stitches with no header.
* .thr (ThreadWorks) full vector embroidery format for threadworks software.
* .vip (Pfaff) Compressed stitches. Like Hus it uses Arc table compression.
* [[Embroidery format PEC|.vp3 (Pfaff) (See detailed article)]] unlike most encoding schemes Pfaff files are stored in designs, blocks, stitch blocks allowing multiple blocks and designs within the file structure (see article on specifics) as such the stitches are encoded in several places with seek values to the next relevant set of data and 0x80 triggered encoding that seems to have very few commands, namely 0x02 which means long form (it may just mean trim, and jump, as Pfaff's software seems to do that) and 0x01 which ends long form. And 0x03 that is seen at the end of the final block.
* .u?? aka .u00, .u01 (Barudan). 512 bytes of header. unsigned(x,y,control) encoded stitches in control, y, x (big endian) form. Can be made to do sequins. Likely any form that can specify needles can be made to do sequins.
* .xxx (Singer)
* .zsk (ZSK USA)
 
=== Misc Other Embroidery Data ===
* .INF, contains only color information like a thread chart.
 
Embroidery files are used both for stitching and for editing. They need to be read by the machine doing the embroidery, which processes the series of commands. Since most machine embroidery is rendered from shapes and the fills applied to those shapes, saving only the data needed to stitch would be lossy. Many formats are therefore hybrids that store easy-to-read stitch data alongside higher-level objects, which are sometimes encrypted and compressed (.hus, .art, .emb). From the higher-level objects the embroidery program can reproduce the lower-level stitch commands. Most programs have their own higher-level objects and can read only the stitch data from other formats. When they write those other formats, they very often produce the minimum acceptable version of the file that will not crash the program reading it. So converting from Wilcom's .emb to .pes will produce a PES file with only stitches, even though Wilcom had access to the higher-level objects and the .pes format could have stored them.


=== Machine Embroidery Files ===


{| class="wikitable"  
{| class="wikitable"  
|+ Embroidery file formats
|+ Embroidery file formats
! extension !! Machine manufacturer !! software range !! Contents
! .ext !! Manufacturer(s) !! Structure
|-
| .10o  || Toyota ||  unsigned(x,y,control) encoded stitch, a separate .00o contain color
|-
| .100  || Toyota ||  4 byte encoding: 2 control bytes, then the 3rd and 4th bytes are signed x, y locations.
|-
| .bro  || Bits & Volts ||  256 byte  header. x == 0x80 control encoding, 2 stop, 3 jump
|-
| .csd || Singer, POEM|| brand-specific
|-
| .dat  || Barudan ||  256 byte header, unsigned(x,y,control)
|-
| .dsb  || Data Stitch Barudan ||  512 byte DST header but stitches are unsigned(x,y,control) style
|-
| [[Embroidery format DST|.dst]]  || Data Stitch Tajima ||    [[Embroidery format DST|(See detailed article)]] 512 byte header, DST encoded direct commands.
|-
| .dsz  || Data Stitch ZSK_USA || 512 byte DST header, big-endian unsigned(x,y,control) style encoding, with specified 4 bit needle
|-
| .emd  || Elna ||  48 (0x30) byte header, x == 0x80 controlled encoding: 0x2A stop/color_change, 0x80 Jump, end 0xFD.
|-
| [[Embroidery format EXP|.exp]]  || Melco, Bernina (high-end models) || [[Embroidery format EXP|Expanded Melco (See detailed article)]] X == 0x80 controlled encoding. A color change 0x80 0x01 is followed directly by 0x00 0x00, and a jump 0x80 0x04 uses the X, Y position that follows it, but must be repeated for each new command.
|-
| .e??  || Eltac || 256 bytes of header, Triplet coded in DST encoding.
|-
| .f??  || Forton || 256 byte header, unsigned(x,y,control) style encoding
|-
| .gt  || Golden Thread || 512 bytes of header, unsigned(x,y,control) style encoding
|-
| [[Embroidery format HUS|.hus]]  || Husqvarna Viking || Compressed bytes. Using a small table arj compression. The belief is that it's from a defunct compression library ArchiveLib by a defunct company GreenLeaf Software using mode "AL_GREENLEAF_LEVEL_4" which it licensed from Robert Jung who wrote ARJ. Consequently the compression shares many of the same attributes.
|-
| .inb  || Inbro || 8192 byte header, unsigned(x,y,control) style encoding
|-
| [[Embroidery format JEF|.jef]]  || Janome || (See specific article) stitch + color, Header information, magic-number thread lookup, and 0x80 triggered control events.
|-
| .ksm  || Pfaff ||  512 byte header. unsigned(x,y,control) triplets. However, it doesn't force that encoding on all jumps. Rather after triggering a color change to a specific needle, it simply gives normal encoded x, y location (indistinguishable from stitches) until control bits of 0x17, 0x18, or 0x19 triggers sewing again.
|-
| .max  || Pfaff || 
|-
| .mit || Mitsubishi || 
|-
| .new || Ameco || 
|-
| .ofm || Melco ||  Compound Binary format with some stitches in there.
|-
| .pcd || Pfaff ||  these actually have a weird encoding scheme. Using absolute positioning rather than relative positioning.
|-
| .pcm || Pfaff ||    stitch
|-
| .pcq || Pfaff ||  stitch
|-
| .pcs || Pfaff ||  stitch
|-
| [[Embroidery format PEC|.pec]]  || Brother, Babylock || [[Embroidery format PEC| (see detailed article)]] colors, stitch, 1 bit graphics, header of #PEC0001, pec block, magic number colors, graphics which are displayed on the machines, contain high-bit long-form + control triggering.
|-
| .pen || Brother - Disney || Graphics files and Encrypted Stitches. Apparently this was a thing.
|-
| .phb,  || Brother || Header, bunch of info and a PEC block.
|-
|-
| ART || Bernina|| brand-specific || vectors, icon, colors, stitch
| .phc,  || Brother || Header, bunch of info and a PEC block.
|-
|-
| CSD || Singer, POEM|| brand-specific ||
| .sew  || Janome, Elna, Kenmore ||   magic number thread lookups. Signed x, y with 0x80 triggered control events.
|-
|-
| [[Embroidery format DST|DST]] || Tajima|| most programs || stitch
| .shv  || Husqvarna Viking || stitch, Big old giant 1 bit graphic, of varying size. Magic number colors, x==0x80 controlled events. Predefined length for stitching events before color switching.
|-
|-
| DSG || Sierra|| Stitch Era software || worker + stitch
| .sst  || Sunstar ||  
|-
|-
| EMB || Wilcom|| most high-end programs || vectors, icon, colors, stitch
| .t01  || Wilcom || contains DST stitches with no header.
|-
|-
| [[Embroidery format EXP|EXP]] || Melco, Bernina (high-end models)|| most programs || stitch
| .t09  || Wilcom || contain Pfaff data.
|-
|-
| FDR || Barudan|| ? || ?
| .tap  || Happy || contains DST stitches with no header.
|-
|-
| HUS || Husqvarna Viking|| brand-specific, many programs || stitch
| .vip  ||Pfaff (older), Husqvarna || stitch, Compressed stitches. Like Hus it uses arj-like compression.
|-
|-
| [[Embroidery format JEF|JEF]] || Janome, Elna|| brand-specific, many programs || stitch + color
| [[Embroidery format VP3|.vp3]] || Pfaff (newer) || [[Embroidery format VP3|(See detailed article)]] unlike most encoding schemes Pfaff files are stored in designs, blocks, stitch blocks allowing multiple blocks and designs within the file structure (see article on specifics) stitches are encoded in several places with seek values to the next relevant set of data and x==0x80 triggered encoding with few commands, namely 0x02 which means long form (it may just mean trim, and jump, as Pfaff's software seems to do that) and 0x01 which ends long form. And 0x03 that is seen at the end of the final block.
|-
|-
| PCQ,PCD,PCM, PCS || Pfaff || Brand-specific || stitch
| [[Embroidery format U??|.u??]] || Barudan ||  [[Embroidery format U??|(See detailed article)]] Barudan calls this FDR. 512 byte header. big-endian unsigned(x,y,control) encoded stitches in control.
|-
| .xxx  ||  Singer, Compucon || stitch
|-
| .zsk  || ZSK USA ||
|-
|}
 
The x?? usually refers to file types with progressing numbers so .u00 .u01, etc, where it starts with u but then has some numbers that change by version.
 
==== Other Related Formats ====
 
{| class="wikitable"
|+ Related file formats
! .ext !! Manufacturer(s) !! Structure
|-
|-
| [[Embroidery format PEC|PEC]] || Bernina ?|| brand-specific || colors, stitch, 1 bit graphics.
| [[Stitch format PMV| .pmv]] || Brother || [[Stitch format PMV| (see detailed article)]] Stitch format, 5 and 6 bit hybrid relative formatting. 5 bits of absolute location in presser foot, 6 bits of forward and back along sew path.
|-
|-
| [[Embroidery format PES|PES]] || Brother || popular, most programs || vectors, colors, (PEC File)
|}
 
== High Level Embroidery Files ==
 
At the other extreme there are vector files that serve to create embroidery files by storing all the data needed to create an embroidery. This means storing the vector shapes, fill types, offsets and motifs, the order in which these are placed, and the start and stop locations. The software then generates a series of line segments through a variety of algorithms and writes them into the machine-readable formats for the machines to read and follow.
 
=== Hybrid Machine/Vector Files ===
 
Many of these formats carry both high-level data and the low-level data needed to run on machines, because both can be stored in one file without ill effect. For example, PES files contain a pointer to a PEC block that is intended entirely for Brother embroidery machines. These blocks also exist in PEC files, which simply start with #PEC0001 followed by the PEC block, and in PHB and PHC files. EMB and ART files likewise contain internal stitch data that embroidery machines can jump to and read.
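Following the PES pointer to its PEC block might look like the sketch below. It rests on assumptions that are not spelled out on this page: the pointer is taken to be a little-endian unsigned 32 bit integer stored directly after the 8 byte "#PES00XX" header, and the function name is hypothetical.

```python
import struct


def locate_pec_block(pes_bytes: bytes):
    """Return (version, pec_offset) read from the start of a PES file."""
    if pes_bytes[:4] != b"#PES":
        raise ValueError("not a PES file")
    version = pes_bytes[4:8].decode("ascii")
    # Assumption: PEC block position is a little-endian uint32 at byte 8
    (pec_offset,) = struct.unpack_from("<I", pes_bytes, 8)
    if pec_offset >= len(pes_bytes):
        raise ValueError("PEC offset points past the end of the file")
    return version, pec_offset
```

A reader that only needs the machine data can skip everything between the header and the returned offset, which is why it can work regardless of the PES version.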
 
=== Digitizing Embroidery Formats ===
 
{| class="wikitable"
|+ Embroidery file formats
! extension !! Manufacturer(s) !!  Structure
|-
|-
| SEW || Janome, Elna, Kenmore|| most programs || stitch
| .art || Bernina || Compound File Binary Format holding a series of files. Different files within the container hold the summary information, the design information, the contents (the compressed stitch data, zlib starting 4 bytes in), and the design icon, a bitmap of what the design should look like.
|-
|-
| SHV || Husqvarna Viking|| brand-specific || stitch
| .emb  || Wilcom || vectors, icon, colors, stitch. This is a fully fledged vector format stored very similarly to .ART; the two clearly share a code base. Several elements are stored via a byte replacement cypher in a zlib compressed stream of data that follows a file size.
|-
|-
| VIP || Pfaff (older), Husqvarna|| brand-specific || stitch
| [[Embroidery format PES|.pes]]  || Brother, Babylock || [[Embroidery format PES|(see detailed article)]], vectors, colors, (PEC File), contains several different layers of information.
|-
|-
[[Embroidery format VP3|VP3]] || Pfaff (newer)|| brand-specific || stitch, color
| .thr || ThreadWorks || full vector embroidery format for threadworks software.
|-
|-
| XXX || Singer, Compucon|| brand-specific || stitch
|}
|}
== Misc Other Embroidery Data ==
* .INF, contains only color information like a thread chart.
Embroidery files are used both for stitching and for editing. They need to be read by the machine doing the embroidery, which processes the series of commands. Since most machine embroidery is rendered from shapes and the fills applied to those shapes, saving only the data needed to stitch would be lossy. Many formats are therefore hybrids that store easy-to-read stitch data alongside higher-level objects, which are sometimes compressed and protected with encryption (.hus, .art, .emb). From the higher-level objects the embroidery program can reproduce the lower-level stitch commands. Most programs that read this data have their own higher-level objects and can read only the stitch data from other formats. When they also write these formats, they very often produce the minimum acceptable version of the file that will not crash the program reading it. So converting from Wilcom's .emb to PES will produce a PES with only stitches, even though the Wilcom file had the higher-level objects and the .pes format also has such objects available.


== Information that may be found ==
Line 192: Line 251:
## Scaling information.


== Other Lists ==
* [http://www.ggcreations.com.au/althea/formats.html Embroidery File Formats supported in Embird]
* [http://www.embroideryarts.com/resource/files/faq/formats_supported.php Formats Supported] at embroideryarts.com

Latest revision as of 07:53, 15 August 2019

Machine embroidery
Module - entry page
Embroidery format
to finalize beginner
2019/08/15
See also

Introduction

There are several kinds of embroidery file formats, and each offers different capabilities and features. Some formats exist in several versions with added features. These usually try to preserve backwards compatibility because of the cost of the hardware in question.

This page attempts to provide a short overview of embroidery formats. For technical details, see the specialized articles (menu to the right). The technical details included here are meant to explain why the formats are the way they are.


The Intended Functions

Much of the confusion around embroidery files comes from the diversity of hardware and software, which at first looks like an unfathomable muddle. The problem can, however, be classified properly. The most significant divide is between files intended to run a machine and files intended for the programs that create those files. It is like the difference between a complex layered Photoshop file with undo history and a JPEG: both can be called "image files", but the distinction is massive. The sections below are divided along these lines to avoid that confusion.

Machine Embroidery Files

The controllers inside the machines are often similar, so the encoding schemes of the embroidery files turn out to be quite similar too. The files are intended to control a stepper or servo motor, an x-y plotter and a needle head, and they do little beyond that. The stitch portion of a file encodes three things: a control command, a dx and a dy. The encoding is also heavily influenced by the physical requirements of the machine. The machine must stitch on every move unless the needle bar is blocked, and that is the difference between a stitch and a jump. When enough jumps occur in a row, machines will often force a trim. Some machines can block the needle bar and move very far from the last stitch, while others will force a trim. There are some cases, such as fringe or puffy appliqué, where this is needed. Given these requirements, the commands used in the machine code are often similar, and sometimes identical, once you get past the header and into the actual stitches in the file.
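The control/dx/dy idea can be sketched in a few lines of Python. This is only an illustration, not the layout of any particular format: the `Command` class, the command names and the `to_absolute` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical command names, for illustration only; each real format
# uses its own numeric codes for these events.
STITCH, JUMP, TRIM, COLOR_CHANGE, END = "stitch", "jump", "trim", "color_change", "end"


@dataclass(frozen=True)
class Command:
    """One machine step: a control command plus a relative move."""
    kind: str
    dx: int = 0
    dy: int = 0


def to_absolute(commands: Iterable[Command]) -> List[Tuple[str, int, int]]:
    """Accumulate the relative dx/dy moves into absolute needle positions."""
    x = y = 0
    positions = []
    for cmd in commands:
        if cmd.kind == END:
            break
        x += cmd.dx
        y += cmd.dy
        positions.append((cmd.kind, x, y))
    return positions


design = [
    Command(STITCH, 10, 0),
    Command(STITCH, 0, 10),
    Command(JUMP, 50, -5),   # needle bar blocked: move without stitching
    Command(STITCH, 3, 3),
    Command(END),
]
print(to_absolute(design))
```

A reader for a real format would only have to translate its own byte codes into such a command list; the accumulation step stays the same.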

The header information is often of a fixed length and contains a lot of display data for the machines, quite often intended for one specific display module. It also means that some formats, such as .DST, .DSB and .DSZ, share the same header information even though the stitch data is encoded entirely differently. These different encoding schemes result in entirely new file types that do nothing new.

To help matters, or rather to make them worse, some companies such as Wilcom create new file types that contain just the stitch data. For example, .t01 is just .dst-encoded stitches without a header, and so is .tap. These formats are literally identical and interchangeable, and you could create either by deleting the first 512 bytes of a .dst file. Wilcom also has other .t?? types whose numbers refer to other encodings.
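A minimal sketch of that conversion, assuming only what is stated above (a 512-byte .dst header followed by the stitch data). The function name `dst_to_headerless` and the sample bytes are made up for the example.

```python
DST_HEADER_SIZE = 512  # header length of a .dst file, as described above


def dst_to_headerless(dst_bytes: bytes) -> bytes:
    """Return only the stitch data of a .dst file (the .t01/.tap content)."""
    if len(dst_bytes) < DST_HEADER_SIZE:
        raise ValueError("too short to contain a 512-byte DST header")
    return dst_bytes[DST_HEADER_SIZE:]


# Fake file for demonstration: a space-padded header plus three stitch bytes.
fake_dst = b" " * DST_HEADER_SIZE + bytes([0x01, 0x00, 0x03])
stitches = dst_to_headerless(fake_dst)
print(len(stitches))
```

On a real file the input would come from `open(path, "rb").read()` and the result would be written back out under the .t01 or .tap extension.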

Different machines have their own formats because a format consists of commands intended only for the particular machine in question, reflecting the specific ways that hardware operates. This is akin to having a different file type for the executables of each type of processor, and that is exactly what these files are intended to be: executables.

Interchange Formats

One consequence of the need for simple machine-readable formats is that, once you cut away the headers, the formats can be used as interchange formats. Embroidery machines do only a few things, and those need to be encoded in ways that the hardware can easily understand and execute. This means the formats are fairly easy to parse as binary, which gives access to the stitch data inside. The ubiquity of .dst files is largely due to the age of the format, but also because many formats are derived from it and the early hardware came from the company that produced it.

Machine Encoding

The encoding of the format varies from company to company and from software maker to software maker, but the encodings often share a common history. For example, Pulse developed the early software for Brother, called PG1. It is also telling that the PEC blocks within Brother files, just like .dst files, have a 512-byte header block starting with a label "LA: (NAME)". In 1982, Pulse Microsystems also developed the first PC software, Stitchworks. The structure of the files is likely due to Pulse as well as to Tajima's dominance in the industry in the 1990s.

Command Encoding Schemes

{| class="wikitable"
! Code Length !! Company Associated !! Structure !! bytes !! Encoding
|-
| 3 || Pulse, Tajima, Eltac || 3 byte swizzled bits, punchcard || yyyyxxxx yyyyxxxx ccyyxx11 || Triplet code with the control bits on the final byte. The last 2 bits of that byte are always set.
|-
| 2 (4) || Melco || signed(x,y), x=0x80 control || xxxxxxxx yyyyyyyy || Series of relative moves, with control events triggered by 0x80 (-128 as a signed byte): 0x01 stop, 0x04 jump, 0x10 end.
|-
| 3 || Barudan || unsigned(x,y) control byte || xxxxxxxx yyyyyyyy 1ccccccc || The control byte provides the direction (sign) of x and y. The last bit controls jump vs. stitch.
|-
| 3 || Foxtron || control byte, unsigned(y,x) || 1ccccccc yyyyyyyy xxxxxxxx || Very similar to the previous scheme, but differs in the byte order of the triplet.
|-
| 2 (3?, 4) || Brother || signed 7 bit (x,y) || cxxxxxxx cyyyyyyy || If the high bit is set, the value becomes a 12-bit signed number, and the top 4 bits of that byte (the first of which triggers the event) are control flags.
|-
| 2 (4) || Compucon, Singer || signed(x,y), x == 0x7F or 0x7D || xxxxxxxx yyyyyyyy || 0x7F triggers a control event, whereas 0x7D triggers the long form: read x and y as two signed 16-bit integers.
|-
| 2 || KSM || unsigned(x,y) control byte || xxxxxxxx yyyyyyyy 1ccccccc || The control byte controls the restart of stitching after a stop. Until stitching restarts, moves are not stitched but are encoded identically to stitches.
|}
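As a rough sketch of how the 0x80-triggered (Melco-style) scheme can be read, the Python below uses the control codes listed above (0x01 stop, 0x04 jump, 0x10 end). It assumes that a control pair other than end is followed by one more signed x, y pair, which is what the "2 (4)" code length suggests, and that end stops parsing at once. The function names are hypothetical, and the real formats differ in details, so treat this as an illustration rather than a reference decoder.

```python
from typing import List, Tuple

# Control codes taken from the Melco row; anything else is reported as "unknown".
CONTROL_CODES = {0x01: "stop", 0x04: "jump", 0x10: "end"}


def signed8(value: int) -> int:
    """Interpret a byte (0..255) as a two's-complement signed number."""
    return value - 256 if value > 127 else value


def decode_x80_stream(data: bytes) -> List[Tuple[str, int, int]]:
    """Decode a stream of (x, y) byte pairs where x == 0x80 marks a control event."""
    events = []
    i = 0
    while i + 1 < len(data):
        first, second = data[i], data[i + 1]
        i += 2
        if first != 0x80:
            events.append(("stitch", signed8(first), signed8(second)))
            continue
        name = CONTROL_CODES.get(second, "unknown")
        if name == "end":
            events.append(("end", 0, 0))
            break
        if i + 1 >= len(data):
            break  # truncated stream: no move follows the control pair
        # Assumption: the move belonging to the control event follows directly.
        events.append((name, signed8(data[i]), signed8(data[i + 1])))
        i += 2
    return events


sample = bytes([0x05, 0xFB,               # stitch dx=5, dy=-5
                0x80, 0x04, 0x10, 0x00,   # jump dx=16, dy=0
                0x80, 0x01, 0x00, 0x00,   # stop followed by 0x00 0x00
                0x80, 0x10])              # end
print(decode_x80_stream(sample))
```

Because 0x80 is reserved as the trigger, a plain move of -128 cannot be expressed in this scheme, which is one reason larger moves are split into several commands.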

The DST (Pulse, Tajima, Eltac) punchcard swizzle is the oddest and oldest of these. I cannot help but imagine the hardware of the reader. The high bits control Y and the low bits control X. One could imagine different mechanisms reading different parts of the card and controlling the x or the y part of the plotter. Reading from bottom to top, you would get the control bits and the biggest movement first, depending on whether there is a hole and on the direction that hole stands for. The next byte up has a third of that impact, and each byte further up a third again. It would also explain why the third byte always ends in 11 (??????11): if you turned the physical card around, those two bits would be in the position of the control bits. 00000011 is the standard no-op card, but read backwards as 11000000 it would set both control bits and therefore STOP, so a backwards card would be detected as a stop. It would be nice to look at a punch card in this encoding from the days before computers. Welcome to the 21st century!

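{| class="wikitable"
|+ DST Encoding
! BYTE !! 7 !! 6 !! 5 !! 4 !! 3 !! 2 !! 1 !! 0
|-
| 1 || y+1 || y-1 || y+9 || y-9 || x-9 || x+9 || x-1 || x+1
|-
| 2 || y+3 || y-3 || y+27 || y-27 || x-27 || x+27 || x-3 || x+3
|-
| 3 || c0 || c1 || y+81 || y-81 || x-81 || x+81 || set || set
|}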
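The bit table translates almost directly into code. The following Python sketch decodes one DST triplet into a relative move and the two control bits. The bit weights come from the table; the function name `decode_dst_triplet` is made up for the example, and the meaning of the individual control bits is left uninterpreted here.

```python
from typing import Tuple

# (bit position, dx, dy) for each byte of the triplet, copied from the table.
BYTE1 = [(7, 0, 1), (6, 0, -1), (5, 0, 9), (4, 0, -9),
         (3, -9, 0), (2, 9, 0), (1, -1, 0), (0, 1, 0)]
BYTE2 = [(7, 0, 3), (6, 0, -3), (5, 0, 27), (4, 0, -27),
         (3, -27, 0), (2, 27, 0), (1, -3, 0), (0, 3, 0)]
BYTE3 = [(5, 0, 81), (4, 0, -81), (3, -81, 0), (2, 81, 0)]


def decode_dst_triplet(b1: int, b2: int, b3: int) -> Tuple[int, int, bool, bool]:
    """Return (dx, dy, c0, c1) for one three-byte DST command."""
    dx = dy = 0
    for byte, weights in ((b1, BYTE1), (b2, BYTE2), (b3, BYTE3)):
        for bit, step_x, step_y in weights:
            if (byte >> bit) & 1:
                dx += step_x
                dy += step_y
    c0 = bool(b3 & 0x80)  # bit 7 of the third byte
    c1 = bool(b3 & 0x40)  # bit 6 of the third byte
    return dx, dy, c0, c1


print(decode_dst_triplet(0x84, 0x00, 0x03))  # bit 7 (y+1) and bit 2 (x+9) of byte 1
print(decode_dst_triplet(0x05, 0x05, 0x07))  # every positive x bit set
```

With every positive x bit set, the weights 1 + 9 + 3 + 27 + 81 add up to 121, the largest move a single triplet can express along one axis.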

Machine Embroidery Files

{| class="wikitable"
|+ Embroidery file formats
! .ext !! Manufacturer(s) !! Structure
|-
| .10o || Toyota || unsigned(x,y,control) encoded stitches; a separate .00o file contains the colors.
|-
| .100 || Toyota || 4-byte encoding: 2 control bytes, with the 3rd and 4th bytes holding signed x, y locations.
|-
| .bro || Bits & Volts || 256-byte header. x == 0x80 control encoding; 2 stop, 3 jump.
|-
| .csd || Singer, POEM || brand-specific
|-
| .dat || Barudan || 256-byte header, unsigned(x,y,control)
|-
| .dsb || Data Stitch Barudan || 512-byte DST header, but the stitches are unsigned(x,y,control) style.
|-
| .dst || Data Stitch Tajima || (See detailed article) 512-byte header, DST-encoded direct commands.
|-
| .dsz || Data Stitch ZSK_USA || 512-byte DST header, big-endian unsigned(x,y,control) style encoding, with a specified 4-bit needle.
|-
| .emd || Elna || 48 (0x30) byte header, x == 0x80 controlled encoding: 0x2A stop/color change, 0x80 jump, 0xFD end.
|-
| .exp || Melco, Bernina (high-end models) || Expanded Melco (See detailed article). x == 0x80 controlled encoding. A color change, 0x01, is followed directly by 0x00 0x00, and a jump, 0x80 0x04, uses the X, Y position that follows it, but must be repeated for each new command.
|-
| .e?? || Eltac || 256 bytes of header, triplets in DST encoding.
|-
| .f?? || Forton || 256-byte header, unsigned(x,y,control) style encoding
|-
| .gt || Golden Thread || 512 bytes of header, unsigned(x,y,control) style encoding
|-
| .hus || Husqvarna Viking || Compressed bytes, using a small-table ARJ-style compression. The belief is that it comes from a defunct compression library, ArchiveLib, by a defunct company, GreenLeaf Software, using the mode "AL_GREENLEAF_LEVEL_4", which it licensed from Robert Jung, who wrote ARJ. Consequently the compression shares many of the same attributes.
|-
| .inb || Inbro || 8192-byte header, unsigned(x,y,control) style encoding
|-
| .jef || Janome || (See specific article) stitch + color. Header information, magic-number thread lookup, and 0x80-triggered control events.
|-
| .ksm || Pfaff || 512-byte header, unsigned(x,y,control) triplets. However, it does not force that encoding on all jumps. Rather, after triggering a color change to a specific needle, it simply gives normally encoded x, y locations (indistinguishable from stitches) until control bits of 0x17, 0x18 or 0x19 trigger sewing again.
|-
| .max || Pfaff ||
|-
| .mit || Mitsubishi ||
|-
| .new || Ameco ||
|-
| .ofm || Melco || Compound Binary format with some stitches in there.
|-
| .pcd || Pfaff || These have an unusual encoding scheme, using absolute positioning rather than relative positioning.
|-
| .pcm || Pfaff || stitch
|-
| .pcq || Pfaff || stitch
|-
| .pcs || Pfaff || stitch
|-
| .pec || Brother, Babylock || (see detailed article) colors, stitch, 1-bit graphics. Header of #PEC0001, PEC block, magic-number colors, graphics which are displayed on the machines; the stitches contain high-bit long-form and control triggering.
|-
| .pen || Brother - Disney || Graphics files and encrypted stitches. Apparently this was a thing.
|-
| .phb || Brother || Header, bunch of info and a PEC block.
|-
| .phc || Brother || Header, bunch of info and a PEC block.
|-
| .sew || Janome, Elna, Kenmore || Magic-number thread lookups. Signed x, y with 0x80-triggered control events.
|-
| .shv || Husqvarna Viking || stitch. A big 1-bit graphic of varying size, magic-number colors, x == 0x80 controlled events. Predefined length for stitching events before color switching.
|-
| .sst || Sunstar ||
|-
| .t01 || Wilcom || contains DST stitches with no header.
|-
| .t09 || Wilcom || contains Pfaff data.
|-
| .tap || Happy || contains DST stitches with no header.
|-
| .vip || Pfaff (older), Husqvarna || stitch. Compressed stitches; like .hus it uses ARJ-like compression.
|-
| .vp3 || Pfaff (newer) || (See detailed article) Unlike most encoding schemes, Pfaff files are organized into designs, blocks and stitch blocks, allowing multiple blocks and designs within the file structure (see the article for specifics). Stitches are encoded in several places, with seek values to the next relevant set of data, using x == 0x80 triggered encoding with few commands: 0x02, which means long form (it may just mean trim and jump, as Pfaff's software seems to do that), 0x01, which ends long form, and 0x03, which is seen at the end of the final block.
|-
| .u?? || Barudan || (See detailed article) Barudan calls this FDR. 512-byte header. big-endian unsigned(x,y,control) encoded stitches in control.
|-
| .xxx || Singer, Compucon || stitch
|-
| .zsk || ZSK USA ||
|}

An extension written as .x?? usually refers to a family of file types with progressing numbers, such as .u00, .u01 and so on: the extension starts with a fixed letter (here u) followed by digits that change by version.

Other Related Formats

{| class="wikitable"
|+ Related file formats
! .ext !! Manufacturer(s) !! Structure
|-
| .pmv || Brother || (see detailed article) Stitch format, 5 and 6 bit hybrid relative formatting. 5 bits of absolute location in presser foot, 6 bits of forward and back along sew path.
|}

High Level Embroidery Files

At the other extreme there are vector files that serve to create embroidery files by storing all the data needed to create an embroidery. This means the vector shapes and fill types, the offsets and motifs, the order in which these occur, and their start and stop locations. A variety of algorithms then generate a series of line segments from this data. These are written into the machine-readable formats so that the machines can read and follow the commands.

Hybrid Machine/Vector Files

Many of these types carry both high-level and low-level data so that they can also run on machines. This is possible because the extra data can be included without ill effect. For example, a PES file contains a pointer to a PEC block that is intended entirely for Brother embroidery machines. The same blocks exist in PEC files, which simply say #PEC0001 and then contain the PEC block, and in PHB and PHC files. EMB and ART contain internal stitch data in parts of the file for the embroidery machines to jump to and read.

Digitizing Embroidery Formats

{| class="wikitable"
|+ Embroidery file formats
! extension !! Manufacturer(s) !! Structure
|-
| .art || Bernina || Compound File Binary Format containing a series of files. Different files within the container hold the summary information, the design information, the contents (the compressed stitch data, zlib starting 4 bytes in), and the design icon, a bitmap of what the design should look like.
|-
| .emb || Wilcom || vectors, icon, colors, stitch. This is a full-fledged vector format stored very similarly to .ART; the two clearly share a code base. Several elements are stored via a byte-replacement cipher in a zlib-compressed stream of data that follows a file size.
|-
| .pes || Brother, Babylock || (see detailed article), vectors, colors, (PEC File), contains several different layers of information.
|-
| .thr || ThreadWorks || full vector embroidery format for ThreadWorks software.
|}

Misc Other Embroidery Data

  • .INF, contains only color information like a thread chart.

Embroidery files are used both for stitching and for editing. They need to be read by the machine doing the embroidery, which processes the series of commands. Since most machine embroidery is rendered from shapes and the fills applied to those shapes, saving only the data needed to stitch would be lossy. Many formats are therefore hybrids that store easy-to-read stitch data alongside higher-level objects, which are sometimes compressed and protected with encryption (.hus, .art, .emb). From the higher-level objects the embroidery program can reproduce the lower-level stitch commands. Most programs that read this data have their own higher-level objects and can read only the stitch data from other formats. When they also write these formats, they very often produce the minimum acceptable version of the file that will not crash the program reading it. So converting from Wilcom's .emb to PES will produce a PES with only stitches, even though the Wilcom file had the higher-level objects and the .pes format also has such objects available.

Information that may be found

  1. Stitch Information.
    1. Direct commands go dx/dy, add stitch, go dx/dy, trim, change threads, stop.
    2. Explicit location of the points for the segment list.
    3. Stitchblocks unbroken lists of stitches in a particular color.
  2. Vector Information
    1. Shape Data, Rectangle, Circle, Path etc.
    2. How these shapes should be filled. For example:
      1. Type of fill being used
      2. Angle of the fill
      3. Angle-path of the fill
      4. Start and stop location within the shape.
      5. Pattern for the needle impacts.
      6. Randomization of edge.
  3. Font Information
    1. Text and font, how it should be applied.
  4. Design information.
    1. Design name.
    2. Design author.
    3. Design comments.
    4. Design keywords.
    5. Design copyright.
    6. Design category.
    7. Number of Stitches.
    8. Number of jumps.
    9. Size of embroidery.
    10. Start Location.
  5. Hoop Information.
    1. Specific custom hoop information.
    2. Distance design is from edge of hoop.
  6. Thread Information.
    1. Color data from a preselected list.
    2. Custom color data for thread.
    3. Thread metadata, manufacturer, pantone approximate, etc.
    4. Thread weight
  7. 2D Bitmap information, simulated view of the sewout.
    1. Bitmap representation for project. EMB contain a full color icon.
    2. Bitmap representation for each color. PEC contains 1 bit graphics.
  8. Control information for the typical editor of that format.
    1. Color of background.
    2. Scaling information.

Other Lists

Acknowledgements

Due to reorganization - i.e. the breakup of the Computerized embroidery page - names of original contributors, in particular Tatarize, do not appear in the history of this page.