
Hungarian Naming Convention

by Jerry Huston

Hungarian Naming Convention can be used for variable or identifier naming in languages such as C, C++, Pascal, and Fortran. It comes from Charles Simonyi's PhD thesis and was first used commercially at Microsoft. Its use makes an identifier both state its type and imply its usage in the program. It makes code much easier to follow for the original programmer at a later date, and easier for more than one person to work with the code at any time.

Could you give a few more examples of Hungarian notation?

From Jerry Huston

Sure. I'll show you some examples of official card-carrying Hungarian notation, and examples of my usual interpretation of it. (I don't want to start any message wars over what's REAL Hungarian... I know the difference, but prefer my own variation.)

Official:

The standard types are...

f    a boolean flag.
ch   a one-byte character.
st   pascal-type string (also sometimes sc).
sz   zero-terminated string.
fn   function.

The standard generic types are...

w    a word, usually 16 bits.
b    a byte, usually 8 bits.
l    a long, usually 32 bits.
u    an unsigned word, usually 16 bits.
bit  a single bit.
v    a void. Usually used with the prefix p, as pointer to void is a meaningful variable type, whereas void isn't.

Some special types...

env  an environment.
sb   segment base.
ib   an offset (combination of index prefix and byte type).

The standard prefixes are...

p    a pointer. Not a type itself, but an operation applied to a type. pch would be a pointer to a character.
lp   a 32-bit far pointer.
hp   a huge pointer.
np   a near pointer.
rg   an array (considering it a group that's the range of a mathematical function).
i    index into an array.
c    count, such as cch, the first byte in an st.
d    difference (between two instances of a type).
h    handle, often a pointer to a pointer.
hh   huge handle, a huge pointer to a 16-bit pointer within the same segment pointed to by the huge pointer.
gr   group, or pointer to it. A group is different from an array, in that a group may contain different-sized objects. Could be a linked-list.
b    an offset, typically used in conjunction with a gr, since an index (i) would be inappropriate for anything but an array.
mp   an array. An abbreviation for map, since an array is a mapping of the index to the value stored at that place.
dn   domain. Used in the rare case when the important part of the array mapping is the index, not the contents.
e    element of an array.
f    a bit within a type. Typically used to store one or more bit flags in unused positions within an integer.
sh   a shift amount. Specifies the location of a bit within a type by the bit number, rather than the bit mask that the f specifies.
u    a union.
a    allocation. Distinguishes between an array and a pointer to it. sz would be a string, asz would be the allocated space that it's stored in.
v    a global.

Some Examples...

pch      pointer to a character.
ich      index into an array of characters.
rgst     an array of pascal-type strings.
bst      offset to a particular pascal-type string in a grst.
phpx     a near pointer to a huge pointer to an object of type x.
pich     a near pointer to an index into a character array.
en       a base type, such as an entry.
hrgn     handle to a region.
dx       length of a horizontal line (difference between two x's).
mpmipfn  an array of pointers to functions, indexed by mi's, where mi might be a menu item.
rgrgx    two dimensional array of x's (array of arrays).
pv       pointer to void (such as an argument to free()).
hrgch    huge pointer to an array of characters.
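To show how a few of these names read in real code, here is a minimal sketch in C. The function names (CchOfSz, IchFindCh, DxOfLine) are made up for illustration and are not taken from any Microsoft source; only the prefixes and types (sz, pch, cch, rgch, ich, ch, dx) come from the lists above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the characters in a zero-terminated string (sz). */
size_t CchOfSz(const char *sz)
{
    const char *pch = sz;   /* pch: pointer to a character */
    size_t cch = 0;         /* cch: count of characters    */

    assert(sz != NULL);
    while (*pch++ != '\0')
        cch++;
    return cch;
}

/* Hypothetical helper: return the index (ich) of the first occurrence of ch
   in the character array rgch, which holds cch characters; -1 if absent. */
int IchFindCh(const char *rgch, int cch, char ch)
{
    int ich;                /* ich: index into an array of characters */

    assert(rgch != NULL);
    for (ich = 0; ich < cch; ich++)
        if (rgch[ich] == ch)
            return ich;
    return -1;
}

/* Hypothetical helper: dx is the difference between two x coordinates. */
int DxOfLine(int xLeft, int xRight)
{
    return xRight - xLeft;
}
```

Even in functions this small, the names carry the bookkeeping: anything starting with c is a count, anything starting with i is an index, and a d value only makes sense as the difference of two values of the same type.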

Fortunately, my own programs and functions are seldom large enough, with enough different variables, for me to need such an elaborate scheme to keep track of identifiers. I haven't worked with the Gospel according to Hungarian enough to look at something like pgrxchDomPev and be able to say, "Of course, obviously that's a ..."

So I use a modified, and quite simplified form, that uses some of the basic tenets of Hungarian notation. Since I work entirely in C or C++, and write programs only for PCs these days, I use abbreviations of the C data types rather than the standard Hungarian type designations.

For example, I use lowercase i to indicate an integer, not w for word, and lowercase c to indicate a character, not b for byte. Some of the official Hungarian designations do fit my needs, such as u for unsigned and sz for zero-terminated string.
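As a rough illustration of this simplified style, the sketch below uses i, c, u, sz, and pc (pointer to char) as prefixes. The function names CountLetter and SetFlag are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how many times cLetter occurs in szText. */
int CountLetter(const char *szText, char cLetter)
{
    int iCount = 0;               /* i: int (official Hungarian would use w) */
    const char *pcNext = szText;  /* pc: pointer to char                     */

    assert(szText != NULL);
    for (; *pcNext != '\0'; pcNext++)
        if (*pcNext == cLetter)   /* c: char (official Hungarian: ch or b)   */
            iCount++;
    return iCount;
}

/* Hypothetical helper: return uFlags with bit number iBit switched on.
   Assumes iBit is a valid bit position for an unsigned int. */
unsigned SetFlag(unsigned uFlags, int iBit)
{
    assert(iBit >= 0);
    return uFlags | (1u << iBit); /* u: unsigned, same as official Hungarian */
}
```

The prefixes here map straight onto C's own type names, which is the whole point of the variation: a C programmer can read them without a lookup table.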

I tend to spell out things that Hungarian would indicate with a one-character modifier. If I were working in a function with two pointers to chars (for example converting a Pascal-type string to a C-type string) I might call them pcCString and pcPasString, and use an offset called iOffset (which would allow converting strings that used one or more bytes to store the length of the Pascal string). Thus, the assignment in that function might look like,

*pcCString++ = *(pcPasString++ + iOffset);
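For context, here is one way the whole function around that assignment might look. This is only a sketch under stated assumptions: the name PasToCString and the variable iLength are invented for the example, the string length is read from the first byte only (so a multi-byte length field would need extra handling), and the caller is assumed to supply a destination buffer with room for the characters plus the terminating zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy a Pascal-type string into a C-type string.
   pcPasString points at the start of the Pascal string (its length field);
   iOffset is the number of bytes that field occupies (1 in the classic case). */
void PasToCString(char *pcCString, const unsigned char *pcPasString, int iOffset)
{
    int iLength;

    assert(pcCString != NULL && pcPasString != NULL && iOffset >= 1);

    /* Assumption: the character count fits in the first byte. */
    iLength = pcPasString[0];

    /* Each pass reads the character iOffset bytes past the current
       position, then advances both pointers by one. */
    while (iLength-- > 0)
        *pcCString++ = (char)*(pcPasString++ + iOffset);

    *pcCString = '\0';  /* C-type strings end with a zero byte */
}
```

Called with a Pascal string holding the length byte 5 followed by "Hello" and an iOffset of 1, this fills the destination with the zero-terminated string "Hello".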

I think the most important thing is to standardize on something that makes sense. Your personal method, or your departmental or company method may not match the one used in a given apps group at Microsoft, but it can still be better than an undisciplined approach that doesn't imply at all what the variable is.

A student once told me about a huge maintenance project that he inherited when he took a job that had been vacated on short notice. The woman who had written the application in the first place had used the names of all her kids, then her nieces and nephews, for variable and function names. When she needed more, she used names of current and previous pets. Compared to that, even my loose interpretation of Hungarian is quite structured!

By the way, the name comes from Charles Simonyi, who works for Microsoft. He wrote a thesis on programmer productivity, a small part of which was his variable naming convention. Friends first called it "Reverse Hungarian Notation," as a play on the fact that Charles is of Hungarian descent, and of course on HP's famous RPN (Reverse Polish Notation). Eventually that just got shortened to "Hungarian Convention" or even just "Hungarian."

Some of the developer groups at Microsoft use a much purer form of Hungarian notation, with prescribed types and modifiers much more strictly enforced. (...and some there don't use it much at all.)

In Charles' original thesis, he felt it better to use variable names that don't imply the data type names used in a particular language, and I don't quite go along with that for my own use. For example, instead of the prefixes b and w to indicate byte and word, I much prefer using c and i to imply char and int. That's perhaps more natural for me, because I do nearly all my work in C++ and C.

Also, I tend to be a bit more descriptive about how a particular datum fits into the function itself, rather than how it relates to other data -- whether it's a range or a domain, for example.

But the basic *idea* of Hungarian is a great one... that of making an identifier both state its type and imply its usage in the program. It makes code much easier to follow for the original programmer at a later date, and easier for more than one person to work with the code at any time.

 
To the best of our knowledge, the text on this page may be freely reproduced and distributed.
If you have any questions about this, please check out our Copyright Policy.

 
