<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>#AltDevBlogADay &#187; Andy Firth</title>
	<atom:link href="http://www.altdevblogaday.com/author/andy-firth/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.altdevblogaday.com</link>
	<description>Each day a little more #gamedev love</description>
	<lastBuildDate>Thu, 17 May 2012 03:06:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Code Build Optimisation Part 4 &#8211; Incremental Linking and the Search for the Holy Grail.</title>
		<link>http://www.altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/</link>
		<comments>http://www.altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/#comments</comments>
		<pubDate>Mon, 26 Dec 2011 23:00:55 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[code optimization]]></category>
		<category><![CDATA[incremental linking]]></category>
		<category><![CDATA[lib]]></category>
		<category><![CDATA[library file]]></category>
		<category><![CDATA[link time]]></category>
		<category><![CDATA[Microsoft Visual C++]]></category>
		<category><![CDATA[MSVC]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Visual C++]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=19768</guid>
		<description><![CDATA[<p>So you&#8217;ve done your full rebuild, waited out your now much shorter build time while talking with colleagues over coffee&#8230; but wait&#8230; you forgot to make that one edit to test your function&#8230; there, done&#8230; you see your file building&#8230; it&#8217;s done and BAM: <em><strong>&#8220;Linking&#8230;&#8221; </strong></em>Time to go get more coffee&#8230; this baby is going to take a while.</p>
<p><a href="http://www.altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/" class="more-link">Read more on Code Build Optimisation Part 4 &#8211; Incremental Linking and the Search for the Holy Grail&#8230;.</a></p>
]]></description>
			<content:encoded><![CDATA[<p>So you&#8217;ve done your full rebuild, waited out your now much shorter build time while talking with colleagues over coffee&#8230; but wait&#8230; you forgot to make that one edit to test your function&#8230; there, done&#8230; you see your file building&#8230; it&#8217;s done and BAM: <em><strong>&#8220;Linking&#8230;&#8221; </strong></em>Time to go get more coffee&#8230; this baby is going to take a while.</p>
<p>Typically on small programs linking is measured in mere seconds (often less) and is largely unnoticeable. However, on larger projects with millions of lines of code, heavy template usage and features such as inline functions and weak references (__declspec(selectany)), your symbol count goes off the charts. Combine this with the probable use of libraries to compartmentalize your code and the likelihood that you&#8217;re building more than one target at a time (DLLs, tools, runtime), and the linker is likely doing a significant amount of work AFTER compiling your single file. As an example, within our current code at Bungie a single file change in the code I usually work on can cost as much as 10 mins in linking (many targets). If I opt to build a single target (say the runtime) then this is reduced to ~2 mins. This is without any linker optimization such as Link Time Code Generation, which is known to heavily increase link time.</p>
<p>In my experience I&#8217;d say most of my changes (and most of the changes my colleagues make) before doing a compile involve a single file. More often than not it&#8217;s simpler than that: a single line. So the iteration is something like</p>
<ul>
<li>make change</li>
<li>hit compile</li>
<ul>
<li>1-2 seconds of compile</li>
<li>1-2 mins to link single runtime target.</li>
</ul>
<li>load game</li>
<li>test</li>
<li>repeat</li>
</ul>
<div>Now, loading the game is always going to be a problem*, but the build itself is completely dominated by linking. If only that step could be faster.</div>
<h2>Incremental Linking</h2>
<p>Incremental linking sets up an <strong>ilk</strong> file. The first time you link with /INCREMENTAL enabled, the linker generates a database of the symbols within the Target executable and the object files that supplied them. This file is used in subsequent links to cross-reference a changed object file (the single file you changed) with the final executable, allowing the linker to &#8220;patch&#8221; the symbols that are different with the new version. The patch is achieved using a code thunk, so the resultant code can be slower. The recommendation is to do a full rebuild before doing any performance testing.</p>
<p>Incremental linking is on for link.exe by default. New projects generated with MSVC will explicitly disable it for Release, however. I would assume that most if not all of your simple programs are using it in debug. Sadly the most common &#8220;working&#8221; case is the one that needs it least; small programs have short link times. In my experience the more complex your project, the less likely incremental linking is working for you. I have spoken with many engineers over the years about incremental linking and almost all believe it to be a flawed feature of MSVC. Sometimes it works, sometimes it doesn&#8217;t. When it doesn&#8217;t work it actually increases link times, and therefore for many engineers it isn&#8217;t worth wrangling with.</p>
<p>I experienced the same: sometimes it works and it&#8217;s fantastic, sometimes it simply doesn&#8217;t&#8230; Most of the time it tells you a reason.</p>
<p>If your project uses any linker options that alter the symbols the linker uses or explicitly lists the order in which those symbols are used then internally the linker will disable incremental linking (the MSDN docs explicitly mention this).</p>
<ol>
<ul>
<li>/ORDER (forcing comdat order): <a href="http://msdn.microsoft.com/en-us/library/00kh39zz(v=vs.71).aspx">http://msdn.microsoft.com/en-us/library/00kh39zz(v=vs.71).aspx</a></li>
<li>/OPT:REF (dead stripping): <a href="http://msdn.microsoft.com/en-us/library/bxwfs976(v=vs.71).aspx">http://msdn.microsoft.com/en-us/library/bxwfs976(v=vs.71).aspx</a></li>
<li>/OPT:ICF (comdat folding): <a href="http://msdn.microsoft.com/en-us/library/bxwfs976(v=vs.71).aspx">http://msdn.microsoft.com/en-us/library/bxwfs976(v=vs.71).aspx</a></li>
</ul>
</ol>
<div>Using /sourcemap (an undocumented method of redirecting pdb source lookup) will automatically disable incremental linking. I don&#8217;t yet understand the reasons for this, but I know it disables it. It&#8217;s undocumented, but several teams I know are using it so I figured I would mention it here.</div>
<div>Barring the above, which are relatively easy to avoid explicitly within your linker options, I still had trouble getting incremental linking working. MSDN explains that a full link occurs when:</div>
<div>
<ul>
<li>The incremental status (.ilk) file is missing. (LINK creates a new .ilk file in preparation for subsequent incremental linking.)</li>
<li>There is no write permission for the .ilk file. (LINK ignores the .ilk file and links nonincrementally.)</li>
<li>The .exe or .dll output file is missing.</li>
<li>The timestamp of the .ilk, .exe, or .dll is changed.</li>
<li>A LINK option is changed. Most LINK options, when changed between builds, cause a full link.</li>
<li>An object (.obj) file is added or omitted.</li>
<li>An object that was compiled with the /Yu /Z7 option is changed.</li>
</ul>
</div>
<div>However, none of these were occurring (that I could tell).</div>
<div>After a few experiments and a lot of compiling, a question came to mind that I hadn&#8217;t considered before.</div>
<h3>What is a (.lib) library file?</h3>
<div>The <a href="http://msdn.microsoft.com/en-us/library/ba1z7822(v=VS.100).aspx">MSDN definition</a> is somewhat verbose but it boils down to: <strong>a library is a container for a set of objects. Link resolves external references by searching first in libraries specified then default libs.</strong></div>
<div>My own understanding of libraries was woefully lacking. I believed a library to be a container for objects only: when you linked against a library, it linked in all the object files. The reality is very different, however. If you link with the command line option <a href="http://msdn.microsoft.com/en-us/library/wdsk6as6(v=VS.80).aspx">/VERBOSE</a> you can see for yourself what actually occurs. The rough algorithm is:</div>
<div>
<ul>
<li>Compile Target Source files (main.cpp etc)</li>
<li>When linking:</li>
<ul>
<li>For each symbol not found in Target Source Files</li>
<ul>
<li>Search objects in supplied libs in the order supplied</li>
<li>When symbol found stop searching</li>
</ul>
<li>end For</li>
</ul>
</ul>
<div>This essentially means that when linking an executable, Link.exe cherry picks the object files that supply the symbols it needs from a library and discards the rest. The upshot of this is effectively object file level dead-stripping as a first level feature of using a lib file. Now let&#8217;s get back to incremental linking.</div>
</div>
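<p>As a rough illustration of the search described above (a toy model, not how link.exe is actually implemented), the sketch below treats each library as an ordered list of object files. The object names, symbol names and the <code>resolve</code> / <code>find_definition</code> helpers are all made up for the example.</p>

```python
from collections import deque


def find_definition(symbol, libraries):
    """Search the libraries in the order supplied; stop at the first match."""
    for lib in libraries:
        for obj in lib:
            if symbol in obj["defines"]:
                return obj
    return None


def resolve(target_objects, libraries):
    """Return the names of the library objects a link would pull in.

    Each object is a dict with "name", "defines" (symbols it supplies)
    and "needs" (symbols it references). This is a simplified model.
    """
    defined = set()
    for obj in target_objects:
        defined |= obj["defines"]

    pending = deque()
    for obj in target_objects:
        pending.extend(sorted(obj["needs"] - defined))

    pulled = []
    while pending:
        symbol = pending.popleft()
        if symbol in defined:
            continue
        found = find_definition(symbol, libraries)
        if found is None:
            raise LookupError("unresolved external symbol: " + symbol)
        # Pulling in an object also brings in that object's own references.
        pulled.append(found["name"])
        defined |= found["defines"]
        pending.extend(sorted(found["needs"] - defined))
    return pulled


main = {"name": "main.obj", "defines": {"main"}, "needs": {"render"}}
engine_lib = [
    {"name": "render.obj", "defines": {"render"}, "needs": {"d3d_create"}},
    {"name": "audio.obj", "defines": {"play_sound"}, "needs": {"xaudio_init"}},
]
sdk_lib = [{"name": "d3d.obj", "defines": {"d3d_create"}, "needs": set()}]

print(resolve([main], [engine_lib, sdk_lib]))  # ['render.obj', 'd3d.obj']
```

<p>In this toy model audio.obj is never pulled in, so its reference to xaudio_init never has to be satisfied. That is the object file level dead-stripping at work.</p>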
<h3>What does that mean for Incremental Linking with Lib Files</h3>
<p>If you&#8217;re simply linking against stable Libs, incremental linking works fine. However, if the code you&#8217;re editing is within a Lib itself, it doesn&#8217;t work. There seems to be no output from link.exe saying it&#8217;s not going to link incrementally; it simply doesn&#8217;t do it. You can see this for yourself by using the /time+ option (another undocumented option) on the linker command line, which describes the various passes the linker does. Experiment yourself with changing a file within a lib and a file within the target exe.</p>
<p>I believe this all stems from a simple fact: Libs are supposed to be stable from your build&#8217;s perspective.</p>
<p>I have always used them for compartmentalization; a simple method to contain functionality within a single file (and therefore reference) rather than having each project have to contain a multitude of files it doesn&#8217;t really need to know about. I believe this is a very common usage pattern. Common enough that every team I&#8217;ve worked with used them in this way. I believe this is the main reason many engineers believe incremental linking is simply a broken feature.</p>
<h3>But I want my cake</h3>
<p>Given the information above the solution &#8220;seems&#8221; simple. Simple enough that VS now provides an option for it based on work by Joshua Jensen who discusses it at length <a href="http://www.workspacewhiz.com/FastSolutionBuild/FastSolutionBuildReadme.html">here</a>.</p>
<p>Within a standard VS2010 project browse to:</p>
<p><strong>&lt;MyProject&gt;/Properties/Configuration Properties/Linker/General/Use Library Dependency Inputs</strong></p>
<p>This option switches a &#8220;library&#8221; included from within your solution, to use the object files directly rather than the library file. This switches the library to my own previous understanding of lib files; all objects will be linked into the target explicitly as if they were Target source files. The lib file itself isn&#8217;t used at all.</p>
<p>One issue, however: as I mentioned earlier, using a lib provides a feature I hadn&#8217;t previously understood, <em>&#8220;effectively object file level dead-stripping&#8221;</em>. If your lib file contains symbols that have a dependency themselves, for instance D3D (very common), then using the above option will require you to explicitly list D3D as a dependency in your Target project setup regardless of whether the Target required the functionality itself. This could mean relatively few changes for you, or it could mean you have to split up your code to remove dependencies that only exist for specific Target executables but not for all.</p>
<p>Dead-stripping (/OPT:REF) has to be off, so this switch will in turn enlarge your Target executables.</p>
<h2>Winning</h2>
<p>Once Incremental Linking is active and working, you&#8217;ve removed the use of Lib files directly for the code you&#8217;re editing, and you&#8217;ve stopped dead-stripping for all local builds, you&#8217;ll see a massive improvement in overall iteration times. For one of our tools projects the link time reduced from ~5 mins to ~10 seconds, and now that we&#8217;re aware of the pitfalls &amp; reasons for linker performance (using /VERBOSE /time+) we&#8217;re reducing that even further. For our main runtime the link reduced from 2 mins to 2-3 seconds; a single file change including deployment was reduced from 145 seconds to ~10 seconds. This seems inconsequential when considered standalone; however, for engineers iterating on functionality within said runtime this is huge and equates to an estimated 30-45 mins per day per engineer.</p>
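<p>As a quick sanity check on those numbers, here is a back-of-envelope sketch. It assumes the 30-45 mins per day estimate comes purely from the single-file-change figures (145 seconds down to ~10 seconds); the function name is made up for the example.</p>

```python
def iterations_per_day(saved_minutes_per_day, before_seconds, after_seconds):
    """How many edit/build iterations a day the quoted saving would imply."""
    saved_per_iteration = before_seconds - after_seconds
    return saved_minutes_per_day * 60 / saved_per_iteration


low = iterations_per_day(30, 145, 10)
high = iterations_per_day(45, 145, 10)
print(round(low, 1), round(high, 1))  # 13.3 20.0
```

<p>Under that assumption the estimate corresponds to roughly 13 to 20 single-file iterations per engineer per day.</p>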
<h2>*Only the penitent man shall cross</h2>
<p>Once you have resolved all the issues that can stop incremental linking from working you are a short step from the Holy Grail. Something I personally have never had working before but will be actively fighting for in the near future. Consider the original process reformatted based on our above work:</p>
<ul>
<li>make change</li>
<li>hit compile</li>
<ul>
<li>1-2 seconds of compile</li>
<li>1-2 <em><strong>seconds</strong></em> to link runtime target.   &lt;= incremental linking is great</li>
</ul>
<li>load game</li>
<li>test</li>
</ul>
<div>In this new iteration process, the initialization &amp; loading of the data the Target uses is highly likely the main time sink. For a game this usually means loading a level, instancing 1000s of game entities and starting all manner of systems. I&#8217;ve seen this be as fast as 30 seconds or as slow as 5-10 mins depending on Build Target (Debug vs slightly optimized vs Release) and the data being tested.</div>
<div><a href="http://msdn.microsoft.com/en-US/library/bcew296c.aspx">Edit-and-continue</a> is a time-saving feature that allows code to be changed while in break mode; when the programmer steps or continues, the &#8220;new&#8221; code is compiled and applied in-situ using incremental linking. It is non-trivial to set up and has all manner of caveats; however, depending on your specific circumstances the win can be huge. Were Edit-and-continue working, the above example becomes:</div>
<div>
<ul>
<li>Once:</li>
<ul>
<li>hit compile &amp; run</li>
<ul>
<li>1-2 seconds of compile</li>
<li>1-2 <em><strong>seconds</strong></em> to link runtime target.</li>
</ul>
<li>load game (30 seconds to 5mins)</li>
</ul>
</ul>
<ul>
<li>Repeat</li>
<ul>
<li>run test within game</li>
<li>break into program</li>
<li>make small change</li>
<li>continue running newly compiled program</li>
</ul>
</ul>
</div>
<div>For small changes this can save a programmer the entire cost of loading the data &amp; initializing the systems being tested. In my current situation, were I to get Edit-and-continue working, I&#8217;d save around 3 minutes per iteration.</div>
<div>For me, being able to iterate on a test situation as fast as possible is paramount. Hopefully the above helps you reach your development goals faster and with a lot less coffee time.</div>
<h2>Notes &amp; References</h2>
<p>Managed C++ cannot link incrementally. If you&#8217;re combining managed &amp; native code, consider splitting the native code into a dll to achieve incremental linking. Obviously this is a lot more work and would likely deserve a cost benefit analysis beforehand.</p>
<p>MSDN Incremental Linking Help: <a href="http://msdn.microsoft.com/en-us/library/4khtbfyf(v=VS.100).aspx">http://msdn.microsoft.com/en-us/library/4khtbfyf(v=VS.100).aspx</a></p>
<p>Edit and Continue: <a href="http://msdn.microsoft.com/en-us/library/esaeyddf.aspx">http://msdn.microsoft.com/en-us/library/esaeyddf.aspx</a></p>
<p>Forum post re: incremental Linking: <a href="http://bytes.com/topic/net/answers/281196-incremental-linking-multiple-projects">http://bytes.com/topic/net/answers/281196-incremental-linking-multiple-projects</a></p>
<p>Link Time Code Generation: <a href="http://msdn.microsoft.com/en-us/magazine/cc301698.aspx">http://msdn.microsoft.com/en-us/magazine/cc301698.aspx</a></p>
<h2>Series</h2>
<p><a href="http://altdevblogaday.com/2011/09/20/codebuild-optimisation1/" target="_blank">Part 1</a>  <a href="http://altdevblogaday.com/2011/11/04/code-build-optimization-part-2/" target="_blank">Part 2</a>  <a href="http://altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/" target="_blank">Part 3</a> <a href="http://altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/" target="_blank">Part 4</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Code Build Optimisation Part 3</title>
		<link>http://www.altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/</link>
		<comments>http://www.altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/#comments</comments>
		<pubDate>Mon, 21 Nov 2011 09:09:18 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[#gamedev]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[badness]]></category>
		<category><![CDATA[build optimization]]></category>
		<category><![CDATA[Header Optimization]]></category>
		<category><![CDATA[heuristic]]></category>
		<category><![CDATA[include hierarchy]]></category>
		<category><![CDATA[VS2010]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=19765</guid>
		<description><![CDATA[<p>So we&#8217;ve all done it. You&#8217;re coding away and you want to call say, a math function from your templated utility class that lives in your header. So you #include its header in your header, everything works allowing you to move on. It&#8217;s a common pattern that many engineering projects are purely feature/result driven; write the code, debug the code, check in and you&#8217;re done&#8230; next task. Rinse and repeat this pattern for a few months/years across many engineers and you inevitably find that your include structure is heavy &#38; costly. But it all works right? Who cares that it takes 5x longer to compile now. If you care, read on.</p>
<p><a href="http://www.altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/" class="more-link">Read more on Code Build Optimisation Part 3&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>So we&#8217;ve all done it. You&#8217;re coding away and you want to call say, a math function from your templated utility class that lives in your header. So you #include its header in your header, everything works allowing you to move on. It&#8217;s a common pattern that many engineering projects are purely feature/result driven; write the code, debug the code, check in and you&#8217;re done&#8230; next task. Rinse and repeat this pattern for a few months/years across many engineers and you inevitably find that your include structure is heavy &amp; costly. But it all works right? Who cares that it takes 5x longer to compile now. If you care, read on.</p>
<p>Once you get to this stage it is often very difficult to figure out what is going on. Depending on the age of your code base you can often be looking at an include tree 4-5 levels deep (sometimes far worse) and involving 1000s of files. How do you retroactively figure out where to direct your attention? Hopefully I can help you there.</p>
<p>I should note that the basic method here was written by one of Bungie&#8217;s senior engineers, <a title="Jon Cable" href="http://www.bungie.net/News/content.aspx?cid=17075" target="_blank">Jon Cable</a>, who kindly let me take over the code and gave permission to discuss it here.</p>
<p>The intent of the algorithm is simple: to rank all header files based on a heuristic involving various aspects of their use. Add to this the ability to inspect the entire include tree of a specific file and you have a powerful tool indeed.</p>
<p>The initial version was brute force. Call the function from your source directory root and it:</p>
<ol>
<li>Hunted down all the cpp/h files and added records for them.</li>
<li>Recursively processed each cpp, generating an include tree and counts per header file based on the number of files it included (children) and the number of files that included it (parents).</li>
<li>Post-processed the entire file list, building counts for the files included BY its children (descendants) and the number of headers that include THIS file, directly or indirectly (ancestors).</li>
<li>Ranked the headers. Jon&#8217;s initial code used the heuristic</li>
<ol>
<ul>
<li>(ancestors * (descendants + 1) )</li>
</ul>
</ol>
</ol>
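<p>The ranking step can be sketched in a few lines. This is a minimal reconstruction, not Jon&#8217;s code: it assumes the include graph is already available as a dictionary, it counts every including file (cpp or header) as an ancestor, and the file names and helper names are made up for the example.</p>

```python
def reachable(graph, start):
    """All nodes reachable from `start` by following edges, excluding `start`."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node in seen or node == start:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def rank_headers(includes):
    """`includes` maps each file to the files it directly #includes."""
    included_by = {}
    names = set(includes)
    for parent, children in includes.items():
        for child in children:
            names.add(child)
            included_by.setdefault(child, []).append(parent)

    scores = {}
    for name in names:
        if not name.endswith(".h"):
            continue
        descendants = len(reachable(includes, name))
        ancestors = len(reachable(included_by, name))
        # The heuristic from the list above.
        scores[name] = ancestors * (descendants + 1)
    # Worst offenders first; ties broken by name for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


includes = {
    "game.cpp": ["entity.h", "math.h"],
    "ai.cpp": ["entity.h"],
    "entity.h": ["math.h", "containers.h"],
    "containers.h": ["math.h"],
    "math.h": [],
}
print(rank_headers(includes))
# [('containers.h', 6), ('entity.h', 6), ('math.h', 4)]
```

<p>The multiplication means a header only ranks highly when it is both widely included and drags a lot of other files in with it. In this toy graph math.h has the most ancestors but ranks last because it includes nothing itself.</p>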
<p>This took a little while to run (4-5 mins) but generated a top ten list of &#8220;bad&#8221; headers. Accompanying the top 10 were txt files allowing the user to search for a specific string in each txt DB and find the children, parents, ancestors and descendants, thus enabling the optimization of that header. The information was good and it was used to optimize on several occasions during previous projects. It was, however, somewhat cumbersome to set up &amp; use, and interpretation of the data was often difficult, so in the end very few engineers used it.</p>
<p>We are inherently visual beasts and I have a penchant for visualization of complex problems. Text is easy to read but often it is very difficult to grok a pattern or understand a tree without visualizing it. I also wanted this tool to be useful as a regular optimization pass to all engineers. I had to make it simple &amp; fast.</p>
<p>Firstly I took the code and replaced the search for files with the dependency database the compiler already uses (we use a Jam Plus variant so this was trivial; I believe it&#8217;s possible using MSBuild too, though). This also allowed for building of the include tree &amp; counts directly from said source. The result was that the information required was available in 3-4 seconds (11,000 file project).</p>
<p>The second pass was to output the &#8220;bad&#8221; headers information in a graph form. I used some old code to output both a Graphviz document &amp; a DGML document of the query I was doing. While doing this I also added the ability to simply request the header structure for an arbitrary file. A screenshot of a DGML example is shown here.</p>
<div id="attachment_20356" class="wp-caption alignnone" style="width: 240px"><a href="http://altdevblogaday.com/wp-content/uploads/2011/11/ExampleHeaderGraph.jpg"><img class="size-medium wp-image-20356" src="http://altdevblogaday.com/wp-content/uploads/2011/11/ExampleHeaderGraph-230x300.jpg" alt="" width="230" height="300" /></a><p class="wp-caption-text">An Example Analysis from Our Engine</p></div>
<p>Obviously the above is too small to see the filenames on purpose :)</p>
<p>The colorization in the above graph is intended to offer some context such as an SDK include, a Windows header, an internal library etc. I prefer the DGML form because it is an active document allowing searching and hotlinking to code as well as graph analysis and reduction of focus. I also output the Graphviz form because DGML is a feature of VS2010 Ultimate only (~$10k per copy), so few engineers have it. The Graphviz form is a simple image and thus not AS powerful but still really useful.</p>
<p>Once this was up and running I wrote a set of VS Macros to provide direct access. From any file in the project an engineer is now able to single click and see the badness rank and/or the include structure of their files.</p>
<p>Future work on this will involve augmenting the heuristic to include characteristics of the code within headers such as template usage or inline expansion both of which contribute significantly to compile times.</p>
<p>Hopefully others recreate the tool (it took only a few hours) and use it to help their projects &amp; compile times.</p>
<p>enjoy.</p>
<h2>Series</h2>
<p><a href="http://altdevblogaday.com/2011/09/20/codebuild-optimisation1/" target="_blank">Part 1</a>  <a href="http://altdevblogaday.com/2011/11/04/code-build-optimization-part-2/" target="_blank">Part 2</a>  <a href="http://altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/" target="_blank">Part 3</a> <a href="http://altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/" target="_blank">Part 4</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Code Build Optimization Part 2</title>
		<link>http://www.altdevblogaday.com/2011/11/04/code-build-optimization-part-2/</link>
		<comments>http://www.altdevblogaday.com/2011/11/04/code-build-optimization-part-2/#comments</comments>
		<pubDate>Fri, 04 Nov 2011 16:08:44 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[#gamedev]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Header Optimization]]></category>
		<category><![CDATA[Include Optimization]]></category>
		<category><![CDATA[VS2010 Macros]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=19763</guid>
		<description><![CDATA[<p>In Part 1 we figured out that File IO was a structural bottleneck in PDB generation and therefore ripe for optimization. Here I describe a continuation of the optimization process into reduction of code compilation cost and how I achieved some good results without manually trawling code.</p>
<p><a href="http://www.altdevblogaday.com/2011/11/04/code-build-optimization-part-2/" class="more-link">Read more on Code Build Optimization Part 2&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>In Part 1 we figured out that File IO was a structural bottleneck in PDB generation and therefore ripe for optimization. Here I describe a continuation of the optimization process into reduction of code compilation cost and how I achieved some good results without manually trawling code.</p>
<p>At this point I began looking at some of our code and playing around with a few <em>longer </em>duration cpp files to try and get a &#8220;feel&#8221; for what was taking so long. The compiler isn&#8217;t very useful in this regard, so at this point we&#8217;re reliant on empirical data. One particular file didn&#8217;t seem to do very much (main hook for unit testing &#8220;core&#8221; systems) but was taking 9 seconds to compile, so I began decimating it. I stubbed out all its functions and it still took ~8 seconds to compile. Because this was a <em>core systems </em>test file it was including a lot of headers for <em>toolbox</em>-like functionality: templated containers, specialized math classes, job framework. Many of those systems do not contain much in terms of stateful code; they simply provide a framework for the client to include and template functionality into their systems. There&#8217;s that word again, <em>template</em>. We&#8217;ll come back to that in a later post.</p>
<p>&nbsp;</p>
<h1>I see Dead People</h1>
<p>Upon examining some of the includes for this single long build cpp file I realized that the development of the unit tests system was playing a part. When it was originally written the code to perform the actual system level testing resided in this cpp file. As the system developed and further modules were added, the file was split into a set of hooks into other cpp files and a framework was added, such that effectively the unit tests file only needed externs to those test blocks.</p>
<p>I pared them down: of the 50+ includes only around 10 were required. Compile time down to &lt;1 second&#8230;<br />
So this innocuous file had 40+ includes it didn&#8217;t need, many of them heavy with inline functions, templates and all manner of other includes. The pre-processed file for it was huge (surprisingly so). Removing the headers was laborious and time intensive: remove a header, compile all targets, and if they all compile &amp; link fine then the header can be removed permanently. This process took around 40 minutes for this single file and we had (at the time) ~7000 source files active&#8230; this would take far too long.<br />
But wait&#8230; we have these machines, they do these things for us when we tell them to&#8230;</p>
<p>&nbsp;</p>
<h1>Do I know Kung-Fu?</h1>
<p>At this point it was obvious that a plugin or macro could do this work for us. What I found, however, was that the help for VS Macros with VS2010 was somewhat lacking; searching the internet often gets more results for Word/Excel than any other application and very little seems to exist in terms of automation beyond trivial operations such as text insertion. The basics of what I wanted to do are simple:</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>for each file in the project
    for each include in the file
        if (file compiles without include)
            remove include
    next
next </code></pre>
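<p>For illustration only, here is the same loop for a single file as a small runnable sketch. It is not the VS macro itself: the build step is passed in as a callback (called <code>builds</code> here, a made-up name), and the example uses a stand-in build function rather than a real compiler.</p>

```python
import re

# Matches lines such as:  #include "foo.h"   or   # include "foo.h"
INCLUDE_RE = re.compile(r"^\s*#\s*include\b")


def strip_unneeded_includes(lines, builds):
    """Try removing each #include line; keep the removal only if `builds` passes.

    `builds` is a hypothetical callback that takes the candidate list of
    lines and returns True when everything still compiles and links.
    """
    kept = list(lines)
    index = 0
    while index != len(kept):
        if INCLUDE_RE.match(kept[index]):
            candidate = kept[:index] + kept[index + 1:]
            if builds(candidate):
                kept = candidate
                continue  # the next line has shifted into `index`
        index += 1
    return kept


source = [
    '#include "math.h"',
    '#include "audio.h"',
    "float half(float x) { return scale(x, 0.5f); }",
]


def fake_build(candidate):
    # Stand-in for a real build: this file only needs math.h for scale().
    return '#include "math.h"' in candidate


print(strip_unneeded_includes(source, fake_build))
```

<p>Each include costs one full build to test, which is why the real thing is slow. The result is also only as trustworthy as the build check behind the callback.</p>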
<p>I spent a weekend playing around with various options and hunting down a lot of the automation options available with VBA / DTE. It did turn out that there is a ton of info but it is very tough to search for it. In the end I found that the most helpful method was to RECORD macros yourself and dissect them into useful snippets for your work.</p>
<p>Some helpers for those who wish to do this themselves.</p>
<h2>To build the startup project use</h2>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>Function BuildStartupProject_f() As Boolean

    ' Build only the startup project, using the active configuration and platform.
    Dim sb As SolutionBuild = DTE.Solution.SolutionBuild
    Dim projName As String = sb.StartupProjects(0)
    Dim Config As String = sb.ActiveConfiguration.Name
    Dim Platform As String = sb.ActiveConfiguration.PlatformName

    ' Show the Output window so the build progress is visible.
    DTE.ExecuteCommand(&quot;View.Output&quot;)
    ' The final True makes the call wait for the build to finish.
    sb.BuildProject(Config + &quot;&#124;&quot; + Platform, projName, True)
    ' LastBuildInfo is the number of projects that failed to build.
    BuildStartupProject_f = (sb.LastBuildInfo = 0)

End Function
</code></pre>
<h2>set up a find target</h2>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>DTE.ExecuteCommand("Edit.Find")
Dim findResult As EnvDTE.vsFindResult
DTE.Find.FindWhat = "^:b*\#include"
DTE.Find.PatternSyntax = vsFindPatternSyntax.vsFindPatternSyntaxRegExpr
DTE.Find.MatchCase = False
DTE.Find.Target = vsFindTarget.vsFindTargetCurrentDocument
DTE.Find.MatchWholeWord = False
DTE.Find.Backwards = False
DTE.Find.MatchInHiddenText = False
DTE.Find.Action = vsFindAction.vsFindActionFind
CType(DTE.Find, EnvDTE80.Find2).WaitForFindToComplete = True
findResult = DTE.Find.Execute() </code></pre>
<h2>create an undo context</h2>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>DTE.UndoContext.Open("IncludeReplace")
' Do stuff here
' If you need to undo, call DTE.UndoContext.SetAborted()
' Otherwise close the context so the edits land as a single undo step
DTE.UndoContext.Close()
</code></pre>
<h2>to iterate over files in a project use</h2>
<p>http://msdn.microsoft.com/en-us/library/aa301042(v=vs.71).aspx</p>
<p>Apologies for not being able to supply a working set of macros, but they fall under my employer&#8217;s contract so I can only provide snippets that are available online, plus guidelines. Feel free to ask any questions over email though and I will answer what I can within the confines of said contract.</p>
<h1>The greatest harm can result from the best intentions</h1>
<p>The brute force mechanism roughly described above does not come without its issues. Upon running it on some of our low level code I created a problem that can only be combated by looking through every change made. The main cause was cpp files where the entire system was compiled out (removed) based on a global define found IN the first include: strip that include and the file still compiles, just to nothing. This is a common pattern for inspection/debug systems and created 3-4 problems on my first run. To combat this I made the overall macro algorithm build all target executables. Without these inspection systems the final exe fails to link, so the bad removal is caught and our tests can continue. This does however push our iteration time up into the 20 mins per file area. As this is an automated system, however, that didn&#8217;t really deter me, and it avoids the false positive.</p>
<h1>I know you! You&#8217;re dead! We killed you</h1>
<p>Running this set of macros on 1 project (our lowest level systems/hardware layer) garnered pretty good results. 30% of our headers were removed in one swoop using a purely brute force method. It took 9 hours in total and saved a respectable 10% off the build of said project on PC/360.</p>
<p>This technique ultimately rests on the ability to cut off an entire tree of headers from the cpp. As such it is at the mercy of the programmer&#8217;s willingness and ability to set up headers in a form that reduces code dependencies whilst providing the functionality the client requires. Next I will be looking into include hierarchy and how we can analyze the cost of a tree with a mind towards culling off branches and reducing the implicit dependencies we create.</p>
<h2>Series</h2>
<p><a href="http://altdevblogaday.com/2011/09/20/codebuild-optimisation1/" target="_blank">Part 1</a>&nbsp;&nbsp;<a href="http://altdevblogaday.com/2011/11/04/code-build-optimization-part-2/" target="_blank">Part 2</a>&nbsp;&nbsp;<a href="http://altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/" target="_blank">Part 3</a>&nbsp;<a href="http://altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/" target="_blank">Part 4</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/11/04/code-build-optimization-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Code Build Optimisation Part 1</title>
		<link>http://www.altdevblogaday.com/2011/09/20/codebuild-optimisation1/</link>
		<comments>http://www.altdevblogaday.com/2011/09/20/codebuild-optimisation1/#comments</comments>
		<pubDate>Tue, 20 Sep 2011 13:15:46 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[#gamedev]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[build]]></category>
		<category><![CDATA[file IO]]></category>
		<category><![CDATA[headers]]></category>
		<category><![CDATA[optimisation]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=16557</guid>
		<description><![CDATA[<p>It&#8217;s 10:30am, you&#8217;ve just arrived at work and your nightly build is telling you all is well; you are ready to start the day&#8217;s tasks. Just as you sit down Bug #497564 comes in, it&#8217;s urgent and it&#8217;s new as of the 9:52am build, you have to get latest (groan). You start to wonder what changes went in since your 4am build started. Did that crazy guy who&#8217;s always in at 5am do something radical like switching out the entire math system again? You hover the mouse over the &#60;sync&#62; button and wonder if you can diagnose the issue without syncing. Dammit no. You click sync and fail to hold back the gasp of horror as you see &#60;THAT FILE&#62; go flying by on the source control TTY, the one that everyone winces at, resulting in a 30 min build time MINIMUM! Time to go get coffee while muttering &#8220;why does that damned build take so long&#8221;.</p>
<p><a href="http://www.altdevblogaday.com/2011/09/20/codebuild-optimisation1/" class="more-link">Read more on Code Build Optimisation Part 1&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s 10:30am, you&#8217;ve just arrived at work and your nightly build is telling you all is well; you are ready to start the day&#8217;s tasks. Just as you sit down Bug #497564 comes in, it&#8217;s urgent and it&#8217;s new as of the 9:52am build, you have to get latest (groan). You start to wonder what changes went in since your 4am build started. Did that crazy guy who&#8217;s always in at 5am do something radical like switching out the entire math system again? You hover the mouse over the &lt;sync&gt; button and wonder if you can diagnose the issue without syncing. Dammit no. You click sync and fail to hold back the gasp of horror as you see &lt;THAT FILE&gt; go flying by on the source control TTY, the one that everyone winces at, resulting in a 30 min build time MINIMUM! Time to go get coffee while muttering &#8220;why does that damned build take so long&#8221;.</p>
<p>It has always been surprising to me that many programmers don&#8217;t delve into the reasons why certain processes take so long, yet we&#8217;ll spend weeks/months optimizing the code itself to extract that last 1ms (guilty as charged). What became obvious however, once I talked to the few that had tried, was that the tools available to aid this task are somewhat dire and often have issues with the scale of our projects. So I delved deeper and this is what I found.</p>
<p>We have 51 targets (libraries, exes, dlls), spread across 11,000+ files ranging from 5-10 lines to 100,000+ lines. A full clean build of all our code across the targets we&#8217;re interested in can hit ~3 hours on a single machine.</p>
<h1>These aren&#8217;t the droids you&#8217;re looking for</h1>
<p>The assumption from 9 out of 10 programmers I talked to is that file IO is the root cause of a slow build. Ergo disabling virus checking, getting a faster hard drive, switching to SSD or using a RAM drive must be the way we speed it up. This sounded fantastically sensible so I investigated speeding up these devices. I chose our mildest target, which showed a build time around 7mins +- 5seconds. My laptop already has both an SSD &amp; SATA installed as well as having 12GB RAM, so I spent some time running tests&#8230; I won&#8217;t bore you with the details but needless to say I confirmed in each case that the circumstances were well set up and spent a full day running builds of various forms in various locations, moving around temporary files, object files, source locations, splitting across drives for temp vs target, disabling virus checking&#8230; it was a very busy day for my poor laptop. Sadly the net effect of all these changes was&#8230; ZERO. Build time remained steady at 7m +- 10s for ALL but 1 test, which was RAM drive for source on a fresh reboot vs SSD for source on a fresh reboot; this is an obvious case as the RAM drive would offset the Windows cache. Even then the difference was nominal (7m30s for SSD, 6m59s for RD).</p>
<p>It was at this point I realized I&#8217;d made the cardinal mistake when optimizing; I must find the problem before attempting a solution.</p>
<h1>Open your mind</h1>
<p>At this year&#8217;s Gamefest Bruce Dawson gave a great talk on a tool I&#8217;ve attempted (and failed) to use previously: XPerf. Thankfully his first 5 mins were spent lamenting just HOW difficult the tool is to use when you are not a part of Microsoft and can&#8217;t simply email the dudes who would know the correct settings. His talk was &#8220;How Valve Makes games better using XPerf&#8221; and recounted what he did, how he did it and ultimately what he thought other studios should know about it and how we can improve upon it. The net gain was a much greater understanding of the tool AND BATCH FILES!!! Magical batch files that set up all the internal components required by XPerf in order to provide decent information.</p>
<p>I used this sorcery to profile our code build; 10 mins later I had 13GB of data and my mind was open. File IO is the problem, but not in the assumed way. Yes there is a lot of reading and writing of data, but looking at the profile info it seems that most time is taken up with WAITING for a write while another write is in progress. Most notably (~90% of cases) this was with the mspdbsrv process, which seems to be attempting to service writes from many locations to the same file. My theory is that the more concurrent builds there are (targets build concurrently at the project level for VS2010), the more the write requests to the PDB back up until eventually the parent process has to stall, which stalls the entire pipeline. In my test the mspdbsrv process occupies approximately 40% of the CPU time the build uses with ~64 CL&#8217;s in flight.</p>
<p>Now I&#8217;m obviously a novice when using XPerf so I could easily be reading the information incorrectly.</p>
<h1>I&#8217;ll be back</h1>
<p>Using this information one of our engineers spent some time looking into splitting up the PDBs themselves. The setup he wrote essentially creates a PDB for every &#8220;N&#8221; files and his cursory tests show a significant performance increase in build from doing this. It has not yet been officially timed. In later parts of this series I&#8217;ll detail the effects of this change (and others).</p>
<h1>Where we&#8217;re going, we don&#8217;t need roads</h1>
<p>There are obviously more brute force methods of optimizing the latency between a code change and a built target, however I believe that starting as you mean to go on and actually learning the core issues is worthwhile to any process like this. The subjects I&#8217;ll be looking into are:</p>
<ul>
<li>hierarchical header optimization</li>
<li>simple header removal</li>
<li>pro-active dead stripping</li>
<li>incredibuild XGE</li>
</ul>
<p>Please feel free to share any information you feel might help as we push forward on this project.</p>
<h2>Series</h2>
<p><a href="http://altdevblogaday.com/2011/09/20/codebuild-optimisation1/" target="_blank">Part 1</a>  <a href="http://altdevblogaday.com/2011/11/04/code-build-optimization-part-2/" target="_blank">Part 2</a>  <a href="http://altdevblogaday.com/2011/11/21/code-build-optimisation-part-3/" target="_blank">Part 3</a> <a href="http://altdevblogaday.com/2011/12/26/code-build-optimisation-part-4-incremental-linking-and-the-search-for-the-holy-grail/" target="_blank">Part 4</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/09/20/codebuild-optimisation1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Practical Floating Point</title>
		<link>http://www.altdevblogaday.com/2011/08/21/practical-flt-point-tricks/</link>
		<comments>http://www.altdevblogaday.com/2011/08/21/practical-flt-point-tricks/#comments</comments>
		<pubDate>Sun, 21 Aug 2011 21:20:28 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[Education]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[divide by zero]]></category>
		<category><![CDATA[floating point]]></category>
		<category><![CDATA[floating point tricks]]></category>
		<category><![CDATA[normalize]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=13673</guid>
		<description><![CDATA[<p>Floating point is ubiquitous in programming these days. Hardware has improved to the point where in many environments it is actually faster to use floating point as opposed to integer (this wasn&#8217;t the case a decade ago). This post will attempt to educate the reader on various &#8220;tricks&#8221; that can help push floating point performance &#38; relative accuracy even further and allow the programmer to avoid some of the pitfalls I have fallen into over the years.</p>
<p><a href="http://www.altdevblogaday.com/2011/08/21/practical-flt-point-tricks/" class="more-link">Read more on Practical Floating Point&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>Floating point is ubiquitous in programming these days. Hardware has improved to the point where in many environments it is actually faster to use floating point as opposed to integer (this wasn&#8217;t the case a decade ago). This post will attempt to educate the reader on various &#8220;tricks&#8221; that can help push floating point performance &amp; relative accuracy even further and allow the programmer to avoid some of the pitfalls I have fallen into over the years.</p>
<p>This article will assume <a href="http://www.mikeash.com/pyblog/friday-qa-2011-01-04-practical-floating-point.html">basic knowledge of floating point numbers</a></p>
<p>So we&#8217;ve all used them, likely for all and sundry within our games/tools, so here&#8217;s a question to ask yourself&#8230; what does the code below print out for the &#8220;test&#8221; variable?</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>{
    float small_value= 1.0f;
    float smaller_value= 1 / 100000000.0f;
    float test= small_value + smaller_value;
    printf(&quot;{%g,%g} =&gt; {0x%08x,0x%08x}\n&quot;, small_value, test, *(int*)&amp;small_value, *(int*)&amp;test);
}
</code></pre>
<p>Almost all programmers I spoke to assumed the value of test to be 1.00000001f and of course mathematically it should be. However it is not; the print out would show:</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>{1,1} =&gt; {0x3f800000,0x3f800000}
</code></pre>
<p>Single precision floating point simply cannot represent the accuracy the result requires and as such the result is kept at 1.0f.</p>
<p>Now if you are mathematically minded this will likely poke at your OCD gene and &#8220;force&#8221; you into using doubles. This is a perfectly palatable solution in many situations where one does not require speed, however double still suffers the same fate if you double the number of zeros in the denominator (1 / 10000000000000000.0).</p>
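<p>As a quick sanity check of the above, here is a minimal sketch that tests whether a small value simply vanishes when added to a larger one. The helper names are made up for illustration, and it assumes IEEE 754 floats and doubles with default rounding.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>#include &lt;cstdio&gt;

// True when adding tiny to base leaves base unchanged.
// The volatile store forces the sum to be rounded to the named type.
bool is_absorbed_float(float base, float tiny)
{
    volatile float sum = base + tiny;
    return sum == base;
}

bool is_absorbed_double(double base, double tiny)
{
    volatile double sum = base + tiny;
    return sum == base;
}

int main()
{
    printf(&quot;float : 1 + 1e-8  absorbed? %d\n&quot;, is_absorbed_float(1.0f, 1.0e-8f));
    printf(&quot;double: 1 + 1e-8  absorbed? %d\n&quot;, is_absorbed_double(1.0, 1.0e-8));
    printf(&quot;double: 1 + 1e-16 absorbed? %d\n&quot;, is_absorbed_double(1.0, 1.0e-16));
    return 0;
}
</code></pre>
<p>On such a platform the float case and the 1e-16 double case should report the small value as absorbed, while 1 + 1e-8 survives in double. Switching to double only moves the cliff, it does not remove it.</p>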
<p>For those of us who work in games math is important, but it&#8217;s far less important than simulation determinism and getting things done expediently whilst executing within performance budgets. This forces us to make decisions that might otherwise be seen as somewhat mathematically incorrect. The above is one such circumstance. In most games applications we cannot afford to switch math to use doubles and as such the inherent limitations of single precision floating point math are deemed acceptable, even becoming &#8220;normal&#8221;. I&#8217;ve been working with them so long that it is now natural to apply these limitations in all circumstances as par for the course.</p>
<p>As was mentioned in the <a href="http://www.mikeash.com/pyblog/friday-qa-2011-01-04-practical-floating-point.html">link</a> I posted above, comparing floating point numbers using == is only sensible in a purely deterministic (read: constant) form. If math is being used to generate the values to be compared then great care has to be taken; great care is rarely taken. The general rule in most studios is simple: don&#8217;t compare floats using == on pain of death or public embarrassment. The usual method is to apply some form of epsilon to the comparison:</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>{
    float a= 1.0f;
    float b= 1.0005f;
    static const float epsilon= 0.0001f;

    float temp= fabs(b - a);

    if(temp &gt; epsilon)
    {
        // not the same
    }
}
</code></pre>
<p>Within games this is one method used to enforce determinism; apply a decent epsilon and most math &#8220;behaves&#8221; (this doesn&#8217;t mean it&#8217;s accurate, mind). One complication however is that epsilon is not always obvious nor can it always be constant, especially within helper classes such as Vector Math.</p>
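<p>One way to cope with a non-constant epsilon is to have the caller pass it in, or to scale it by the magnitude of the operands. The sketch below shows both; the function names and the scaling rule are illustrative assumptions, not taken from any particular math library.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>#include &lt;algorithm&gt;
#include &lt;cmath&gt;
#include &lt;cstdio&gt;

// Absolute comparison: fine when the magnitude of the data is known up front.
bool nearly_equal(float a, float b, float epsilon)
{
    return std::fabs(a - b) &lt;= epsilon;
}

// Scaled comparison: the tolerance grows with the larger operand, so the same
// epsilon can be reused by helpers that do not know the units of the data.
bool nearly_equal_scaled(float a, float b, float epsilon)
{
    const float scale = std::max(1.0f, std::max(std::fabs(a), std::fabs(b)));
    return std::fabs(a - b) &lt;= epsilon * scale;
}

int main()
{
    printf(&quot;%d\n&quot;, nearly_equal(1.0f, 1.0005f, 0.0001f));
    printf(&quot;%d\n&quot;, nearly_equal(10000.0f, 10000.5f, 0.0001f));
    printf(&quot;%d\n&quot;, nearly_equal_scaled(10000.0f, 10000.5f, 0.0001f));
    return 0;
}
</code></pre>
<p>The scaled form trades strictness for reuse: half a unit of difference at 10000 passes with an epsilon of 0.0001, which may or may not be what your data wants.</p>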
<p>A good epsilon is usually dependent upon the data you&#8217;re representing. If you are dealing in world coordinates for instance where 1.0f = 1m then </p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>0.01    = 1cm
0.001   = 1mm
0.0001  = 100&#181;m
0.00001 = 10&#181;m
</code></pre>
<p>Many of the games I&#8217;ve worked on use 1mm as their positional epsilon (0.001f) and 10&#181;m for directional/rotational epsilon (0.00001f). For positional epsilon this provides an effective range of approximately +/- 10000.001f and for rotational epsilon +/- 100.00001f.</p>
<p>The general rule of thumb I use is 8 decimal places between your highest and lowest accuracy requirements. If you feel you need a larger range than this but still want accuracy then consider using a reference frame; 1 value to represent low accuracy high values (say kilometers) and another to represent high accuracy low values (meters down to &#181;m).</p>
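<p>To see where a given epsilon stops being representable, it can help to print the gap between adjacent floats at different magnitudes. This is a small sketch assuming IEEE 754 single precision; float_spacing is a made up helper name.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>#include &lt;cmath&gt;
#include &lt;cstdio&gt;

// Gap between value and the next representable float above it.
float float_spacing(float value)
{
    return std::nextafter(value, INFINITY) - value;
}

int main()
{
    const float magnitudes[] = { 1.0f, 100.0f, 1000.0f, 10000.0f, 100000.0f };

    for (float magnitude : magnitudes)
    {
        printf(&quot;spacing at %g = %g\n&quot;, magnitude, float_spacing(magnitude));
    }
    return 0;
}
</code></pre>
<p>At 10000 the gap is just under 0.001, so a 1mm epsilon is about as fine as a float can resolve out there; beyond that the epsilon is smaller than the steps between values.</p>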
<p>Now one area that seems to catch a lot of people out (myself included on many occasions) is Infinity &amp; NaN Rules&#8230; so here is a handy table borrowed from <a href="http://users.tkk.fi/jhi/infnan.html">here</a>, his looks prettier.</p>
<table border="1">
<tr>
<td>
<table border="1">
<tr>
<td class="oph">+</td>
<td class="nih">-Inf</td>
<td class="nfh">-1</td>
<td class="nzh">-0</td>
<td class="pzh">0</td>
<td class="pfh">1</td>
<td class="pih">Inf</td>
<td class="nah">NaN</td>
</tr>
<tr>
<td class="nih">-Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nfh">-1</td>
<td class="ni">-Inf</td>
<td class="nf">-2</td>
<td class="nf">-1</td>
<td class="nf">-1</td>
<td class="pz">0</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nzh">-0</td>
<td class="ni">-Inf</td>
<td class="nf">-1</td>
<td class="nz">-0</td>
<td class="pz">0</td>
<td class="pf">1</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pzh">0</td>
<td class="ni">-Inf</td>
<td class="nf">-1</td>
<td class="pz">0</td>
<td class="pz">0</td>
<td class="pf">1</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pfh">1</td>
<td class="ni">-Inf</td>
<td class="pz">0</td>
<td class="pf">1</td>
<td class="pf">1</td>
<td class="pf">2</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pih">Inf</td>
<td class="na">NaN</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nah">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
</table>
</td>
<td>
<table border="1">
<tr>
<td class="oph">-</td>
<td class="nih">-Inf</td>
<td class="nfh">-1</td>
<td class="nzh">-0</td>
<td class="pzh">0</td>
<td class="pfh">1</td>
<td class="pih">Inf</td>
<td class="nah">NaN</td>
</tr>
<tr>
<td class="nih">-Inf</td>
<td class="na">NaN</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nfh">-1</td>
<td class="pi">Inf</td>
<td class="pz">0</td>
<td class="nf">-1</td>
<td class="nf">-1</td>
<td class="nf">-2</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nzh">-0</td>
<td class="pi">Inf</td>
<td class="pf">1</td>
<td class="pz">0</td>
<td class="nz">-0</td>
<td class="nf">-1</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pzh">0</td>
<td class="pi">Inf</td>
<td class="pf">1</td>
<td class="pz">0</td>
<td class="pz">0</td>
<td class="nf">-1</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pfh">1</td>
<td class="pi">Inf</td>
<td class="pf">2</td>
<td class="pf">1</td>
<td class="pf">1</td>
<td class="pz">0</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pih">Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nah">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
</table>
</td>
</tr>
<tr>
<td>
<table border="1">
<tr>
<td class="oph">*</td>
<td class="nih">-Inf</td>
<td class="nfh">-1</td>
<td class="nzh">-0</td>
<td class="pzh">0</td>
<td class="pfh">1</td>
<td class="pih">Inf</td>
<td class="nah">NaN</td>
</tr>
<tr>
<td class="nih">-Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nfh">-1</td>
<td class="pi">Inf</td>
<td class="pf">1</td>
<td class="pz">0</td>
<td class="nz">-0</td>
<td class="nf">-1</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nzh">-0</td>
<td class="na">NaN</td>
<td class="pz">0</td>
<td class="pz">0</td>
<td class="nz">-0</td>
<td class="nz">-0</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pzh">0</td>
<td class="na">NaN</td>
<td class="nz">-0</td>
<td class="nz">-0</td>
<td class="pz">0</td>
<td class="pz">0</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pfh">1</td>
<td class="ni">-Inf</td>
<td class="nf">-1</td>
<td class="nz">-0</td>
<td class="pz">0</td>
<td class="pf">1</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pih">Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nah">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
</table>
</td>
<td>
<table border="1">
<tr>
<td class="oph">/</td>
<td class="nih">-Inf</td>
<td class="nfh">-1</td>
<td class="nzh">-0</td>
<td class="pzh">0</td>
<td class="pfh">1</td>
<td class="pih">Inf</td>
<td class="nah">NaN</td>
</tr>
<tr>
<td class="nih">-Inf</td>
<td class="na">NaN</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nfh">-1</td>
<td class="pz">0</td>
<td class="pf">1</td>
<td class="pi">Inf</td>
<td class="ni">-Inf</td>
<td class="nf">-1</td>
<td class="nz">-0</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nzh">-0</td>
<td class="pz">0</td>
<td class="pz">0</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="nz">-0</td>
<td class="nz">-0</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pzh">0</td>
<td class="nz">-0</td>
<td class="nz">-0</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="pz">0</td>
<td class="pz">0</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pfh">1</td>
<td class="nz">-0</td>
<td class="nf">-1</td>
<td class="ni">-Inf</td>
<td class="pi">Inf</td>
<td class="pf">1</td>
<td class="pz">0</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="pih">Inf</td>
<td class="na">NaN</td>
<td class="ni">-Inf</td>
<td class="ni">-Inf</td>
<td class="pi">Inf</td>
<td class="pi">Inf</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
<tr>
<td class="nah">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
<td class="na">NaN</td>
</tr>
</table>
</td>
</tr>
</table>
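<p>A few entries from the tables can be checked directly. The sketch below assumes IEEE 754 behaviour (no fast-math style compiler flags); the divide helper and k_inf are illustrative names only.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>#include &lt;cmath&gt;
#include &lt;cstdio&gt;
#include &lt;limits&gt;

const float k_inf = std::numeric_limits&lt;float&gt;::infinity();

// Plain division; with IEEE 754 floats a zero divisor yields Inf or NaN
// rather than a trap (assumption: default floating point environment).
float divide(float a, float b)
{
    return a / b;
}

int main()
{
    printf(&quot;Inf - Inf is NaN : %d\n&quot;, std::isnan(k_inf - k_inf));
    printf(&quot;0 * Inf is NaN   : %d\n&quot;, std::isnan(0.0f * k_inf));
    printf(&quot;0 / 0 is NaN     : %d\n&quot;, std::isnan(divide(0.0f, 0.0f)));
    printf(&quot;1 / 0 is Inf     : %d\n&quot;, divide(1.0f, 0.0f) == k_inf);
    printf(&quot;-1 / 0 is -Inf   : %d\n&quot;, divide(-1.0f, 0.0f) == -k_inf);
    printf(&quot;1 / -Inf is -0   : %d\n&quot;, std::signbit(divide(1.0f, -k_inf)));
    return 0;
}
</code></pre>
<p>Note that NaN never compares equal to anything, including itself, which is why the checks use std::isnan rather than ==.</p>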
<p>On the subject of error handling, consider the standard method of normalizing a 3D vector.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>vector3 normalize_vector(const vector3 &amp; vec)
{
    float length= vector_length(vec);
    vector3 result=vec;

    if(length &gt; 0.0f)
    {
        float reciprocal= 1.0f / length;

        result.x *= reciprocal;
        result.y *= reciprocal;
        result.z *= reciprocal;
    }

    return result;
}
</code></pre>
<p>We want to avoid introducing a problem by way of an INF=&gt;NAN in the returned data or throwing an exception, so a branch is inserted. This effectively removes the divide by zero problem, however at great cost; on many platforms a mispredicted branch flushes the instruction pipeline, resulting in significant performance issues. The problem is there really isn&#8217;t another way to achieve the same avoidance mathematically.</p>
<p>There is however a method of avoiding it if you rely upon floating point math. We&#8217;ve established that a large value remains the same when a small value is added, and we&#8217;ve also discussed that the effective range of floating point values for use in games is limited, both for determinism and to avoid issues with the math itself. Combining the two provides a rather elegant method of avoiding the divide by zero issue under normalization and in many other well known situations.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>const float very_small_float= 1.0e-037f;
vector3 normalize_vector_2(const vector3 &amp; vec)
{
    float length= very_small_float + vector_length(vec);

    float reciprocal= 1.0f / length;

    vector3 result;

    result.x = vec.x * reciprocal;
    result.y = vec.y * reciprocal;
    result.z = vec.z * reciprocal;

    return result;
}
</code></pre>
<p>By limiting the allowed range of vector3 components, and understanding that adding the very small value to any value larger than 1.0e-29 has zero effect, we have effectively removed the possibility of receiving INF and thus NAN as the result. (A denormal length is still possible, however much less likely.)</p>
<p>There is a plethora of information out there on floating point, both practical and theoretical. The above is my attempt to represent those cases not covered or not well highlighted, and hopefully make people think about their implementations more in terms of the specific requirements and less in terms of &#8220;floating point numbers handle everything&#8221;.</p>
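<p>For anyone wanting to try this out, here is a self-contained sketch of the branch-free version. The vector3 layout and the vector_length helper are assumptions filled in for illustration, and the behaviour relies on components staying within the limited range discussed earlier.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>#include &lt;cmath&gt;
#include &lt;cstdio&gt;

struct vector3 { float x, y, z; };   // assumed layout

// Assumed helper: plain Euclidean length.
float vector_length(const vector3 &amp; vec)
{
    return std::sqrt(vec.x * vec.x + vec.y * vec.y + vec.z * vec.z);
}

const float very_small_float = 1.0e-37f;

vector3 normalize_vector_2(const vector3 &amp; vec)
{
    // Adding the tiny bias keeps the divisor non-zero without a branch.
    const float length = very_small_float + vector_length(vec);
    const float reciprocal = 1.0f / length;

    vector3 result;
    result.x = vec.x * reciprocal;
    result.y = vec.y * reciprocal;
    result.z = vec.z * reciprocal;
    return result;
}

int main()
{
    const vector3 zero = { 0.0f, 0.0f, 0.0f };
    const vector3 unit = normalize_vector_2({ 3.0f, 0.0f, 4.0f });
    const vector3 safe = normalize_vector_2(zero);

    printf(&quot;{3,0,4} =&gt; {%g,%g,%g}\n&quot;, unit.x, unit.y, unit.z);
    printf(&quot;{0,0,0} =&gt; {%g,%g,%g}\n&quot;, safe.x, safe.y, safe.z);
    return 0;
}
</code></pre>
<p>A zero vector comes back as a zero vector rather than NaN, because 0 multiplied by a large but finite reciprocal is still 0.</p>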
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/08/21/practical-flt-point-tricks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The demise of the low level Programmer.</title>
		<link>http://www.altdevblogaday.com/2011/08/06/demise-low-level-programmer/</link>
		<comments>http://www.altdevblogaday.com/2011/08/06/demise-low-level-programmer/#comments</comments>
		<pubDate>Sat, 06 Aug 2011 18:45:30 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[#gamedev]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[low level]]></category>
		<category><![CDATA[old school]]></category>
		<category><![CDATA[optimization]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=13450</guid>
		<description><![CDATA[<p>When I started programming many of the elements we take for granted now did not exist. There was no DirectX and not many compatible libs were available for the free compilers of the day. So I had to write my own code for most basic programs: keyboard handlers, mouse handlers, video memory accessors, rasterizers, texture mappers, blitters&#8230; the programs I wrote then were 100% my own code and I had to be able to handle anything and everything.</p>
<p><a href="http://www.altdevblogaday.com/2011/08/06/demise-low-level-programmer/" class="more-link">Read more on The demise of the low level Programmer&#8230;.</a></p>
]]></description>
			<content:encoded><![CDATA[<p>When I started programming many of the elements we take for granted now did not exist. There was no DirectX and not many compatible libs were available for the free compilers of the day. So I had to write my own code for most basic programs: keyboard handlers, mouse handlers, video memory accessors, rasterizers, texture mappers, blitters&#8230; the programs I wrote then were 100% my own code and I had to be able to handle anything and everything.</p>
<p>Personally I&#8217;ve always been interested in what was going on under the hood so this suited me just fine. I always dug into the details and I almost always end up programming as close to the bone ON the hardware (or OS) as I possibly can, both to eke out as much performance as possible AND to satisfy my own hunger for knowledge.</p>
<p>Combine the two and what you get today is someone who enjoys spending 5 days making that single function 20x faster, who revels in reducing the memory footprint of the primary data structure by 1 byte per element across the entire program whilst simultaneously writing a pre-caching system to avoid the special case issues&#8230;. who&#8230; well you get the picture&#8230; I&#8217;m an OCD level sport optimizing geek. To such a degree that I keep a notepad by my bed to take notes when I wake up from a &#8220;bug fix&#8221; dream or a &#8220;eureka optimization&#8221; doze&#8230;yes i&#8217;m that sick.</p>
<p>Over the last decade I&#8217;ve been involved in the hiring process at many studios and in more recent years I&#8217;ve noticed a pattern. Knowledge of what is generally considered &#8220;low-level&#8221; programming is waning. Many programmers know enough to get through a C# or C++ test, but don&#8217;t understand something as basic (and important) as the behavior of memory or god forbid a cache. They don&#8217;t seem to grasp that one must understand the native environment you&#8217;re working in before going ahead and writing a program to run within it. The intricacies of floating point vs fixed point math are completely lost on them, as the term &#8220;fixed point&#8221; brings about a blank stare; floating point numbers are best, right? I once mentioned bit shifting to an experienced engineer of 10 years and was devastated by the complete lack of basic understanding.</p>
<p>It depresses me that so much of what I consider to be essential is simply not being taught anymore. I&#8217;m not talking about assembly language per se; even those of us who used to spend hours writing assembly now more often opt to use intrinsics built into compilers to avoid the stress and complication. What I&#8217;m talking about is simply the understanding of WHAT is happening when someone does i++ and not ++i, why one might opt to stripe a memory copy/set in certain circumstances.</p>
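<p>To make the i++ vs ++i point concrete, here is a minimal sketch using a made up counter type. For plain ints most compilers will generate the same code when the result is unused; the difference shows up with class types such as iterators, where the postfix form has to produce a copy of the old value.</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>#include &lt;cstdio&gt;

// Hypothetical type standing in for an iterator or similar.
struct counter
{
    int value;

    // ++i : modify in place and hand back a reference, no copy made.
    counter &amp; operator++()
    {
        ++value;
        return *this;
    }

    // i++ : copy the old state, modify, then return the copy by value.
    counter operator++(int)
    {
        counter old = *this;
        ++value;
        return old;
    }
};

int main()
{
    counter c = { 5 };

    counter before = c++;   // before holds the old value, c has moved on
    printf(&quot;post: returned %d, counter now %d\n&quot;, before.value, c.value);

    counter &amp; after = ++c;  // after refers to c itself
    printf(&quot;pre : returned %d, counter now %d\n&quot;, after.value, c.value);
    return 0;
}
</code></pre>
<p>Whether the extra copy survives optimization depends on the type and the compiler, which is exactly why it pays to know it is there.</p>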
<p>So here goes&#8230; a list of things I believe all console programmers should fully understand (and that I recommend to all programmers as good reading), with links to educate where possible. Feel free to suggest more/better links.</p>
<ul>
<li>Floating Point Numbers</li>
<ul>
<li>They are very useful but often used in situations where they simply don&#8217;t suit the solution the programmer is attempting to write. The following links should provide some background and info on where they are not so useful, what the pitfalls are and sometimes even how to avoid them.</li>
<li>http://www.cprogramming.com/tutorial.html#fptutorial</li>
<li>http://www.johndcook.com/blog/2009/04/06/numbers-are-a-leaky-abstraction/</li>
<li>http://www.codeproject.com/KB/recipes/float_point.aspx</li>
<li>http://drdobbs.com/184402741?pgno=4</li>
<li>http://users.tkk.fi/jhi/infnan.html</li>
</ul>
<li>Fixed Point Numbers</li>
<ul>
<li>Fixed point math is mildly old school but it is VERY useful both to understand its makeup and to use. Sadly because it is considered old school many of the online sources are out of date.</li>
<li>http://x86asm.net/articles/fixed-point-arithmetic-and-tricks/</li>
<li>http://gameprogrammer.com/4-fixed.html</li>
</ul>
<li>Processor Cache Behavior / Memory</li>
<ul>
<li>http://www.akkadia.org/drepper/cpumemory.pdf</li>
<li>http://en.wikipedia.org/wiki/CPU_cache</li>
<li>http://igoro.com/archive/gallery-of-processor-cache-effects/</li>
</ul>
<li>Bit Shifting</li>
<ul>
<li>http://www.cprogramming.com/tutorial/bitwise_operators.html</li>
<li>useful hacks (use carefully)</li>
<li>http://graphics.stanford.edu/~seander/bithacks.html</li>
<li>http://stackoverflow.com/questions/539836/emulating-variable-bit-shift-using-only-constant-shifts</li>
<li>http://guru.multimedia.cx/avoiding-branchesifconditionals/</li>
</ul>
<li>Branch Prediction</li>
<ul>
<li>This may be lower level than people think they need to go&#8230; but they&#8217;d be wrong. Understanding how the hardware you&#8217;re programming for treats branches can affect performance to a HUGE degree&#8230; far more than most programmers may appreciate; think <em>death</em> by a thousand <em>cuts</em>.</li>
<li>http://cellperformance.beyond3d.com/articles/2006/04/background-on-branching.html</li>
<li>http://igoro.com/archive/fast-and-slow-if-statements-branch-prediction-in-modern-processors/</li>
<li>http://www.k8gu.com/ece.umn.edu/documents/classes/ece362-branch-prediction.pdf</li>
<li>http://www.cs.ucr.edu/~gupta/teaching/203A-09/My6.pdf</li>
</ul>
<li>Sorting</li>
<ul>
<li>This isn&#8217;t really low-level but it is something I consider &#8220;basic&#8221;, and it&#8217;s an area where many programmers are simply lacking in understanding. Do yourself a favor and play around with this link, read the links it sends you to for each algorithm and try to grasp when each might be used and the properties as described. The next time you need to sort something&#8230; consult it.</li>
<li>http://www.sorting-algorithms.com/</li>
<li>another good link with sub links: http://corte.si//posts/code/visualisingsorting/index.html</li>
<li>a funny one (but still strangely useful), the bubble sort dance: http://www.youtube.com/watch?v=lyZQPjUT5B4</li>
</ul>
</ul>
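<p>To make the fixed point entry above a little more concrete, here is a minimal 16.16 sketch. All of the names are invented for illustration and this is not taken from any of the linked articles. The value is stored in a 32-bit integer scaled by 65536; multiplies and divides go through a 64-bit intermediate so the scale factor can be corrected without overflowing. The scale is a power of two, so for non-negative values the multiply and divide by k_fixed_one are just shifts.</p>

```cpp
#include <cstdint>

// 16.16 fixed point: upper 16 bits integer part, lower 16 bits fraction.
typedef std::int32_t fixed16_16;

const int k_fraction_bits = 16;
const fixed16_16 k_fixed_one = 1 << k_fraction_bits;   // 65536 represents 1.0

inline fixed16_16 fixed_from_int(int value)
{
    return static_cast<fixed16_16>(value) * k_fixed_one;
}

// truncates toward zero
inline int fixed_to_int(fixed16_16 value)
{
    return value / k_fixed_one;
}

// (a * 2^16) * (b * 2^16) carries the scale twice, so divide one copy back out
inline fixed16_16 fixed_mul(fixed16_16 a, fixed16_16 b)
{
    std::int64_t wide = static_cast<std::int64_t>(a) * b;
    return static_cast<fixed16_16>(wide / k_fixed_one);
}

// pre-scale the numerator so the quotient keeps its 16 fraction bits
inline fixed16_16 fixed_div(fixed16_16 a, fixed16_16 b)
{
    std::int64_t wide = static_cast<std::int64_t>(a) * k_fixed_one;
    return static_cast<fixed16_16>(wide / b);
}
```

<p>Addition and subtraction need no helpers at all; they are plain integer operations, which is a large part of the appeal on hardware where floating point is slow or absent. Note that this sketch does no overflow or divide-by-zero checking.</p>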
<p>So there you go, I&#8217;ve likely missed some aspects that should make the list but if you can grasp the above then you&#8217;re more likely to get noticed at interviews and you&#8217;ll certainly be a better programmer for it.</p>
<blockquote><p>&nbsp;</p>
<p>Edit 8/8/2011 &#8211; It seems I didn&#8217;t do a very good job at explaining the core audience for the article, I apologize for that. The programmers I would like to see learning and absorbing these &#8220;low level&#8221; details are those who would see themselves working in console games. The HW is fixed for long periods of time and resources become scarce, usually within 1 game cycle; understanding how to better utilize resources becomes a key part of the job.</p>
<p>While I believe that most programmers would benefit from understanding more about the medium they work in, it is certainly not required.</p>
<p>Hopefully you enjoyed the article in the spirit it was intended.</p>
<p>Andy Firth</p></blockquote>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/08/06/demise-low-level-programmer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is an &#8220;Engineer Architect&#8221;</title>
		<link>http://www.altdevblogaday.com/2011/07/22/what-is-an-engineer-architect/</link>
		<comments>http://www.altdevblogaday.com/2011/07/22/what-is-an-engineer-architect/#comments</comments>
		<pubDate>Fri, 22 Jul 2011 16:18:30 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[#gamedev]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[architect]]></category>
		<category><![CDATA[engine]]></category>
		<category><![CDATA[low level]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=12194</guid>
		<description><![CDATA[<p>A question that I get asked rather a lot is &#8220;What is an Engineer Architect?&#8221;</p>
<blockquote><p>
<b>en·gi·neer</b><br />
noun /ˌenjəˈni(ə)r/ <br />
A person who designs, builds, or maintains engines, machines, or public works<br />
<b><br />
ar·chi·tect</b><br />
noun /ˈärkiˌtekt/ <br />
((computer science) the structure and organization of a computer&#8217;s hardware or system software) &#8220;the architecture of a computer&#8217;s system software&#8221;</p></blockquote>
<p><a href="http://www.altdevblogaday.com/2011/07/22/what-is-an-engineer-architect/" class="more-link">Read more on What is an &#8220;Engineer Architect&#8221;&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>A question that I get asked rather a lot is &#8220;What is an Engineer Architect?&#8221;</p>
<blockquote><p>
<b>en·gi·neer</b><br />
noun /ˌenjəˈni(ə)r/ <br />
A person who designs, builds, or maintains engines, machines, or public works<br />
<b><br />
ar·chi·tect</b><br />
noun /ˈärkiˌtekt/ <br />
((computer science) the structure and organization of a computer&#8217;s hardware or system software) &#8220;the architecture of a computer&#8217;s system software&#8221;</p></blockquote>
<p>This basically translates to the same thing all engineers do: </p>
<ul>
<li>Analyze the problem</li>
<li>Formulate a possible set of solutions </li>
<li>Analyze the solutions</li>
<li>Decide what to implement </li>
<li>Implement </li>
<li>Test/bug fix</li>
<li>Profile/sanitize</li>
<li>Repeat until satisfied </li>
</ul>
<p>however with one important difference. My role involves the entire program, how its parts interact and how they are designed, what their dependencies are and how they affect the final output and the performance of that output. Combine this with data parallel infrastructure, task parallel objects, multiple platforms, myriad hardware limitations and a large programming team &#8230; under many circumstances one might imagine something akin to </p>
<div class="separator" style="clear: both;text-align: center">
<a href="http://2.bp.blogspot.com/-5Z2NdBq6VRA/TWX1VXrFszI/AAAAAAAABmQ/bbR-kD-TQow/s400/spaghetti-head.jpg"><img border="0" height="400" src="http://2.bp.blogspot.com/-5Z2NdBq6VRA/TWX1VXrFszI/AAAAAAAABmQ/bbR-kD-TQow/s400/spaghetti-head.jpg" width="297" /></a></div>
<p>and you often wouldn&#8217;t be too far from the truth on a daily basis. Long term however we (the team) have a plan and a set of goals. Over time we re-assess based on those we hit and those we don&#8217;t, new requirements and ultimately how the game is progressing and where it needs to go. The role I play in this is simply to ensure that under the stress and strain of day to day development, we&#8217;re still aiming in the right direction as a whole. That decisions take into account as much of the big picture as possible and weigh that against the immediate requirements of the current goal. There are several people in similar roles at Bungie and our interactions provide a simple but effective method of applying &#8220;Checks &amp; Balances&#8221; to the progress we make; we each bring our own flavor to the table.</p>
<p>Day to day this involves tasks such as</p>
<ul>
<li>Advising/Teaching on how to handle concurrency</li>
<li>Long term interface design</li>
<li>Short term prototyping/bug fix hacks</li>
<li>Discussing/Advising on future platforms</li>
<li>Optimizing programmer iteration, debugging &amp; general workflow</li>
<li>Managing external teams</li>
<li>Auditioning Middleware</li>
<li>designing/writing/debugging infrastructure systems</li>
</ul>
<p>Technically I&#8217;m a member of Bungie&#8217;s &#8220;Infrastructure&#8221; team. This means that if it&#8217;s something &#8220;unexciting&#8221;, &#8220;behind the scenes&#8221; or seemingly doesn&#8217;t affect the final game at all&#8230; we handle it. This involves systems like</p>
<ul>
<li>Memory Allocation</li>
<li>File System</li>
<li>Network Transport</li>
<li>Crash Handling, Minidumps</li>
<li>Debugger Plugins</li>
<li>Multi-threading Infrastructure &amp; Architecture</li>
<li>Asset import/baking</li>
<li>Math Library</li>
<li>Schematization/Reflection</li>
<li>Audio Engine</li>
<li>Container Classes</li>
<li>Profiler Infrastructure</li>
<li>Compiler Configuration</li>
<li>Low &amp; High level Optimization</li>
<li>Build systems</li>
<li>Flux Capacitor Maintenance</li>
<li> &#8211; you still reading?</li>
</ul>
<p>As I said&#8230; the stuff most programmers find tiresome and boring. Our team loves this stuff and we&#8217;re good at it.</p>
<p>So there you have it, a much better idea of what I do at Bungie and hopefully a guide to those in school who might want to progress towards a similar role (or avoid it).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/07/22/what-is-an-engineer-architect/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Writing a &#8220;Pre-Main&#8221; function &#8211; Forcing Global Initialization order within VC++</title>
		<link>http://www.altdevblogaday.com/2011/07/07/writing-a-pre-main-function-forcing-global-initialization-order-within-vc/</link>
		<comments>http://www.altdevblogaday.com/2011/07/07/writing-a-pre-main-function-forcing-global-initialization-order-within-vc/#comments</comments>
		<pubDate>Thu, 07 Jul 2011 06:24:56 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[C++ Initialization order]]></category>
		<category><![CDATA[init_seg]]></category>
		<category><![CDATA[Pre Main]]></category>

		<guid isPermaLink="false">http://altdevblogaday.com/?p=10654</guid>
		<description><![CDATA[<p>One rule of C/C++ programming that bites everyone eventually is that initialization order of global variables across compilation units is not guaranteed. I&#8217;ve seen programs with global variable dependencies run fine for years then suddenly develop &#8220;issues&#8221; resulting in a hard lock before main(..) is hit. I&#8217;ve seen other programs have dependency issues but not actually crash resulting in subtle bugs that never seem to negatively affect run-time&#8230; until they do, usually on the day before gold master when the lead programmer is in Peru.</p>
<p><a href="http://www.altdevblogaday.com/2011/07/07/writing-a-pre-main-function-forcing-global-initialization-order-within-vc/" class="more-link">Read more on Writing a &#8220;Pre-Main&#8221; function &#8211; Forcing Global Initialization order within VC++&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>One rule of C/C++ programming that bites everyone eventually is that initialization order of global variables across compilation units is not guaranteed. I&#8217;ve seen programs with global variable dependencies run fine for years then suddenly develop &#8220;issues&#8221; resulting in a hard lock before main(..) is hit. I&#8217;ve seen other programs have dependency issues but not actually crash resulting in subtle bugs that never seem to negatively affect run-time&#8230; until they do, usually on the day before gold master when the lead programmer is in Peru.</p>
<p>So here&#8217;s the situation: you want to create objects &#8220;a&#8221; through &#8220;g&#8221; in various files, each of which registers itself with a manager (into a list) on construction. This is a relatively common setup and relatively simple within a single compilation module, where globals are constructed in the order they are defined: define the manager before the objects that register with it and it will work. Across compilation modules, however, things are not so simple and more often than not the manager will construct itself at a very inopportune time; likely after some objects have registered themselves and before all objects have.</p>
<p>Without simply knowing that the manager will initialize first there really isn&#8217;t a &#8220;Good&#8221; solution, though there are several lazy initialization techniques.</p>
<p>So here&#8217;s where things get &#8220;fun&#8221;, and by &#8220;fun&#8221; I mean people look at the code and get confused looks on their faces.</p>
<p>Put the following code into a cpp file on its own and add it to your project of choice.</p>
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>#include &lt;stdio.h&gt;

// Place this unit's global initializers in .CRT$XCB so they run before
// ordinary user initializers, which the compiler puts in .CRT$XCU.
#pragma init_seg( ".CRT$XCB" )

class c_blog_first_class_construction
{
public:
    c_blog_first_class_construction()
    {
        // runs before the program's ordinary global constructors
        printf("first class construction\n");
    }

    ~c_blog_first_class_construction() {}
};

static c_blog_first_class_construction blog_first_class_construction;
</code></pre>
<p>The constructor for c_blog_first_class_construction will be called before any other constructor in your code (assuming you have nothing else like this in there).</p>
<p>NOTE: Expect C4075: initializers put in unrecognized initialization area, disable it if you feel the need.</p>
<p>the key to the code is</p>
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>#pragma init_seg(...)
</code></pre>
<p>For more info on what this does see <a href="http://support.microsoft.com/kb/104248">kb104248</a>.</p>
<p>The crux of it, however, is that we are tagging this compilation unit&#8217;s initializers with a specific section name. When the linker reads the various .CRT groups, it combines them into one section and orders them alphabetically. The user-defined global initializers (which the Visual C++ compiler puts in .CRT$XCU) will always come after .CRT$XCA and before .CRT$XCZ, and because .CRT$XCB sorts before .CRT$XCU, our initializers run ahead of them.</p>
<p>So we are depending on the linker to do the right thing and insert our section into the CRT sorted sections (verified in VS2010 as of 7/6/2011 for x86, x64 and x360). It works and it&#8217;s very useful.</p>
<p>Now if you want to get really fancy&#8230; make the constructor for c_blog_first_class_construction call a function that initializes your &#8220;Pre construction&#8221; requirements engine wide (usually memory management, assert systems).</p>
<p>For other platforms (GCC/SNC are the only others I use) there are other options such as &#8220;init_priority&#8221;, which is a per-instance attribute. These can be used to achieve the same result but on a more granular level.</p>
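<p>As a hedged sketch of what the GCC route can look like: init_priority is a real GCC attribute that takes a value from 101 to 65535, with lower numbers constructed first. Everything else below (c_tracked, the BLOG_INIT_PRIORITY macro and the g_construction_* bookkeeping) is invented for illustration, and the macro deliberately compiles away on toolchains where I am not sure the attribute is honoured.</p>

```cpp
// Plain ints are zero-initialized before any constructor runs,
// so they are safe to touch during global construction.
static int g_construction_order[2];
static int g_construction_count;

// Hypothetical class that records the order in which instances are built.
struct c_tracked
{
    explicit c_tracked(int id)
    {
        g_construction_order[g_construction_count++] = id;
    }
};

#if defined(__GNUC__) && !defined(__APPLE__)
    // GCC extension: lower number means earlier construction
    #define BLOG_INIT_PRIORITY(n) __attribute__((init_priority(n)))
#else
    // assumption: fall back to ordinary definition order elsewhere
    #define BLOG_INIT_PRIORITY(n)
#endif

// Defined first, but asked to construct late.
c_tracked late_object BLOG_INIT_PRIORITY(2000) = c_tracked(1);

// Defined second, but asked to construct early.
c_tracked early_object BLOG_INIT_PRIORITY(1000) = c_tracked(2);
```

<p>On GCC the intent is that early_object is constructed before late_object despite the definition order, and the same priorities apply across compilation units, which is the part that matters here. As with init_seg, this is something to use sparingly and comment heavily.</p>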
<p>It should be noted that this is not a method that one should use all over a codebase; personally I only ever use it on one compilation module and it is always VERY well commented.</p>
<p>enjoy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/07/07/writing-a-pre-main-function-forcing-global-initialization-order-within-vc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Extending the Watching Window in Visual Studio via a Debugger Addin</title>
		<link>http://www.altdevblogaday.com/2011/06/22/extending-the-watching-window-in-visual-studio-via-a-debugger-addin/</link>
		<comments>http://www.altdevblogaday.com/2011/06/22/extending-the-watching-window-in-visual-studio-via-a-debugger-addin/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 02:07:18 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://altdevblogaday.org/?p=8832</guid>
		<description><![CDATA[<p>So previously we established how to use autoexp.dat for <a href="http://andyfirth.blogspot.com/2011/05/becoming-console-programmer-extending.html">simple format changes</a> and how to <a href="http://andyfirth.blogspot.com/2011/05/becoming-console-programmer-extending_21.html">use a DLL</a> to achieve those changes. There are major limitations to the DLL approach however in that we cannot access global symbols and thus any system that relies on global state for debug (as many do) cannot be debugged.</p>
<p><a href="http://www.altdevblogaday.com/2011/06/22/extending-the-watching-window-in-visual-studio-via-a-debugger-addin/" class="more-link">Read more on Extending the Watching Window in Visual Studio via a Debugger Addin&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>So previously we established how to use autoexp.dat for <a href="http://andyfirth.blogspot.com/2011/05/becoming-console-programmer-extending.html">simple format changes</a> and how to <a href="http://andyfirth.blogspot.com/2011/05/becoming-console-programmer-extending_21.html">use a DLL</a> to achieve those changes. There are major limitations to the DLL approach however in that we cannot access global symbols and thus any system that relies on global state for debug (as many do) cannot be debugged.</p>
<p>Here I will discuss how to create a VS2010 plugin using C# (plugins also work for VS2008 however not using C#).</p>
<p>The <a href="http://msdn.microsoft.com/en-us/library/envdte._dte.aspx">Development Tools Environment</a> framework was set up around a decade ago to allow Visual Studio extensibility. I would recommend that any programmer read through what is possible using DTE; however, I&#8217;m not going to invest much time into that here. I will simply use one aspect of it to achieve our goal: global symbol access.</p>
<p>VS2010 has a great Wizard to help us here. Select <i>File-&gt;New-&gt;Project</i>, then on the left hand side go to <i>Other Project Types-&gt;Extensibility-&gt;Visual Studio Addin</i>. Click next and choose the (default) option <i>Create an Add-in using Visual C#</i>, leave the application host as default, then give your addin a recognizable name. Set your addin to load when the host application starts. The rest is default.</p>
<p>This provides us with a very simple addin with a LOT of C# goodness and very little code. I should note that when I wrote this project it was my first foray into C# and thus my knowledge in this area is shaky at best; I had lots of help from experts local to me in setting this up.</p>
<p>If you read through the code you will find some relatively basic functionality. Your new addin has interfaces to intercept callbacks for various VS2010 events. OnConnection is the only default interface I used, and it is where I installed the new event handlers.</p>
<p>So let&#8217;s extend the default a little:</p>
<ol>
<li>Add a member to the class of type &#8220;DebuggerEvents&#8221;</li>
<li>instantiate it to &#8220;_applicationObject.Events.DebuggerEvents&#8221;</li>
<li>set up handlers for:</li>
<ol>
<li>OnEnterBreakMode</li>
<li>OnEnterRunMode</li>
<li>OnEnterDesignMode</li>
</ol>
<li>VS2010 should auto-complete and auto generate these for you but the basics should look like :
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>_debugger_events.OnEnterBreakMode += new _dispDebuggerEvents_OnEnterBreakModeEventHandler(DebuggerEvents_OnEnterBreakMode);
_debugger_events.OnEnterRunMode += new _dispDebuggerEvents_OnEnterRunModeEventHandler(_debugger_events_OnEnterRunMode);
_debugger_events.OnEnterDesignMode += new _dispDebuggerEvents_OnEnterDesignModeEventHandler(_debugger_events_OnEnterDesignMode);</code></pre>
</li>
<li>For now ignore all but OnEnterBreakMode; this will be called whenever we enter any type of break mode (stepping, breakpoints, exceptions etc). At this point we can do something quite nifty&#8230; any expression you can type into a watch window can be evaluated here, for example:
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>Debugger debugger = _applicationObject.Debugger;
Expression exp = debugger.GetExpression("&amp;g_my_global_foo,d");

if (exp.IsValidValue)
{
&nbsp;&nbsp;&nbsp; Int64 value = Convert.ToInt64(exp.Value.ToString(), 10);
&nbsp;&nbsp;&nbsp; System.Diagnostics.Debug.WriteLine(exp.Value.ToString());

&nbsp;&nbsp;&nbsp; // do other fancy things here
}</code></pre>
</li>
<li>At this point it&#8217;s really up to you to choose how you communicate this information to any external process. Personally my first simple solution was an old school environment variable. This works perfectly well for a limited project: simply push your symbol address to the environment variable and let any other process (EEAddin for instance) read that environment variable and use it. Should you choose to publish the root address of your global systems, for instance, you might conceivably achieve full global memory addressing on your target.</li>
<li>The OnEnterRunMode hook should be used to &#8220;create&#8221; any resources you need on events &#8220;Attach&#8221; and &#8220;Launch&#8221;</li>
<li>The OnEnterDesignMode hook should be used to &#8220;destroy&#8221; those resources for events &#8220;Detach, EndProgram &amp; Stop Debugging&#8221;</li>
</ol>
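<p>On the consuming side (the native EEAddin DLL, say), reading the published address back can be as simple as the sketch below. This is only an illustration under a few assumptions: the variable name BLOG_GLOBAL_FOO_ADDRESS is made up, the address is assumed to have been written as a decimal string (matching the &#8220;,d&#8221; format used in the expression above), and the reader is assumed to be able to see the variable the addin set.</p>

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical name; use whatever your debugger addin actually publishes.
static const char *const k_address_variable = "BLOG_GLOBAL_FOO_ADDRESS";

// Parse a decimal address string. Returns false for null, empty,
// non-numeric or out-of-range text, leaving *out_address untouched.
bool parse_published_address(const char *text, unsigned long long *out_address)
{
    if (text == 0 || text[0] < '0' || text[0] > '9')
    {
        return false;
    }

    errno = 0;
    char *end = 0;
    unsigned long long value = std::strtoull(text, &end, 10);

    if (errno != 0 || end == text || *end != '\0')
    {
        return false;
    }

    *out_address = value;
    return true;
}

// Fetch the variable and parse it; false if it is missing or malformed.
bool read_published_address(unsigned long long *out_address)
{
    return parse_published_address(std::getenv(k_address_variable), out_address);
}
```

<p>The defensive parsing is deliberate: the watch window calls into the addin constantly, and a stale or half-written variable should produce &#8220;no data&#8221; rather than a read from a garbage address.</p>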
<p>Debugging the debugger addin isn&#8217;t difficult but is not obvious. Given the setup (Client Project) using (Debugger Addin) which launches Watch Window DLLs (EEAddin), the debugging method I used is:</p>
<ol>
<li>Open the Debugger Addin project and build/run. This should open up an instance of VS2010 with your Addin loaded (check under <i>Tools-&gt;Addin Manager</i>).</li>
<li>Within this second instance of VS2010 open up your Client Project, Build and run.</li>
<li>At this point breakpoints within the Debugger Addin are possible and you should be able to trap the incoming events.</li>
</ol>
<p>Notes:</p>
<ul>
<li>Using environment variables will work for very simple projects; however, for production code I would recommend using process specific memory mapped files or another less hacky method.</li>
<li>VS2008 can be supported; however, this has to be done as a C++/ATL project and requires a LOT of boilerplate code. Jason Weiler discusses this <a href="http://www.myopictopics.com/?p=91">here</a> and played a large part in enlightening me to this process.</li>
<li>As previously with EEAddin, this framework is not very forgiving and will require you to write defensively in all aspects. C# does handle a lot of this for you; however, the C++/ATL version will not. If you are writing an addin framework for a large group of programmers I would suggest significant boilerplate on all code.</li>
<li>The DTE framework is VERY powerful. Anyone using it should read through the docs thoroughly to appreciate everything it can do, and share what you do if you can :D</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/06/22/extending-the-watching-window-in-visual-studio-via-a-debugger-addin/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Extending the Watch Window in Visual Studio via EEAddin</title>
		<link>http://www.altdevblogaday.com/2011/06/22/extending-the-watch-window-in-visual-studio-via-eeaddin/</link>
		<comments>http://www.altdevblogaday.com/2011/06/22/extending-the-watch-window-in-visual-studio-via-eeaddin/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 02:06:50 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://altdevblogaday.org/?p=8829</guid>
		<description><![CDATA[<p>So we established that it&#8217;s possible to re-format what a Watch Window shows you, make things look better and therefore easier to consume. What we cannot do with that method, though, is any real processing&#8230; but we can with a little more work. </p>
<p><a href="http://www.altdevblogaday.com/2011/06/22/extending-the-watch-window-in-visual-studio-via-eeaddin/" class="more-link">Read more on Extending the Watch Window in Visual Studio via EEAddin&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>So we established that it&#8217;s possible to re-format what a Watch Window shows you, make things look better and therefore easier to consume. What we cannot do with that method, though, is any real processing&#8230; but we can with a little more work. </p>
<p>Consider the example:</p>
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>// blogtest.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;   // memset
#include &lt;intrin.h&gt;

enum e_union_test_case
{
&nbsp;&nbsp;&nbsp; k_unset,
&nbsp;&nbsp;&nbsp; k_type_string,
&nbsp;&nbsp;&nbsp; k_type_int,
&nbsp;&nbsp;&nbsp; k_type_float,
};

struct u_test_level1
{
&nbsp;&nbsp;&nbsp; int m_type;
&nbsp;&nbsp;&nbsp; union u_data
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; const char *m_string;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int m_count;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; float m_value;
&nbsp;&nbsp;&nbsp; }m_data;

&nbsp;&nbsp;&nbsp; void set(const char *string)
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; m_type= k_type_string;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; m_data.m_string= string;
&nbsp;&nbsp;&nbsp; }

&nbsp;&nbsp;&nbsp; void set(int int_value)
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; m_type= k_type_int;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; m_data.m_count= int_value;
&nbsp;&nbsp;&nbsp; }

&nbsp;&nbsp;&nbsp; void set(float float_value)
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; m_type= k_type_float;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; m_data.m_value= float_value;
&nbsp;&nbsp;&nbsp; }

&nbsp;&nbsp;&nbsp; void print(void)
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_unset)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; printf("unset\n");
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_string)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; printf("%s \n",m_data.m_string);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_int)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; printf("%d \n",m_data.m_count);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_float)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; printf("%f \n",m_data.m_value);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp; }
};

struct s_test_level2
{
&nbsp;&nbsp;&nbsp; static const int k_num_elements= 32;

&nbsp;&nbsp;&nbsp; u_test_level1 m_elements[k_num_elements];
};

int _tmain(int argc, _TCHAR* argv[])
{
&nbsp;&nbsp;&nbsp; s_test_level2 test;

&nbsp;&nbsp;&nbsp; memset(&amp;test,0,sizeof(test));

&nbsp;&nbsp;&nbsp; test.m_elements[0].set("my foo");
&nbsp;&nbsp;&nbsp; test.m_elements[1].set(7);
&nbsp;&nbsp;&nbsp; test.m_elements[2].set(3.14159265358f);&nbsp;&nbsp;&nbsp; // everyone loves Pi

&nbsp;&nbsp;&nbsp; return 0;
}</code></pre>
<p>Similar to before, set up a basic console project and drop in this code. Compile it, set a breakpoint on the final return statement, run to that breakpoint and drop &#8220;test&#8221; into your watch window, opening up the m_elements member&#8230; you should see something like:</p>
<div class="separator" style="clear: both;text-align: center">
<a href="http://1.bp.blogspot.com/-sgLfjCserMw/Tdg1YV398kI/AAAAAAAAACU/Xv2vEfpqQ00/s1600/autoexp_blog3.JPG"><img border="0" src="http://1.bp.blogspot.com/-sgLfjCserMw/Tdg1YV398kI/AAAAAAAAACU/Xv2vEfpqQ00/s1600/autoexp_blog3.JPG" /></a></div>
<p>Not very readable at all&#8230; the print function shows what we &#8220;might&#8221; do to display these elements by using &#8220;m_type&#8221; to change the formatting options&#8230;</p>
<p>EEAddin (Expression Evaluation Addin) is an option here. It allows the autoexp.dat [AutoExpand] section to call into a DLL for the preview display string. The DLL can output any string it desires into the provided char buffer (obeying the limits of course).</p>
<p>You can find an example EEAddin within the VS2010 install. Assuming you have the samples extracted (most installs will have a zip file), you will find the solution at</p>
<p><span style="font-size: x-small">C:\Program Files (x86)\Microsoft Visual Studio 10.0\Samples\1033\VC2010Samples\C++\Debugging\EEaddin</span></p>
<p><span style="font-size: small">NOTE: the example does not work out of the box, you must edit &#8220;ADDIN_API&#8221; and make it</span></p>
<p><span style="font-size: small">#define ADDIN_API __declspec(dllexport)&nbsp; __stdcall</span></p>
<p><span style="font-size: small">then move the HRESULT return type to the opposite side of ADDIN_API wherever it is used. Optionally you could achieve the same results by altering the project compiler options.</span></p>
<p>Now paste the code</p>
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>enum e_union_test_case
{
&nbsp;&nbsp;&nbsp; k_unset,
&nbsp;&nbsp;&nbsp; k_type_string,
&nbsp;&nbsp;&nbsp; k_type_int,
&nbsp;&nbsp;&nbsp; k_type_float,
};

struct u_test_level1
{
&nbsp;&nbsp;&nbsp; int m_type;
&nbsp;&nbsp;&nbsp; union u_data
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; const char *m_string;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int m_count;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; float m_value;
&nbsp;&nbsp;&nbsp; }m_data;

&nbsp;&nbsp;&nbsp; void print(char *result_buffer, int result_size)
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_unset)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"unset\n");
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_string)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"str:\n");
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_int)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"int:%d \n",m_data.m_count);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_float)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"flt:%f \n",m_data.m_value);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }

&nbsp;&nbsp;&nbsp; }
};

HRESULT ADDIN_API AddIn_blogtest( DWORD dwAddress, DEBUGHELPER *pHelper, int nBase, BOOL bUniStrings, char *pResult, size_t max, DWORD reserved )
{
&nbsp;&nbsp;&nbsp; DWORD nGot;

&nbsp;&nbsp;&nbsp; u_test_level1 example;

&nbsp;&nbsp;&nbsp; // read the u_test_level1 instance from debuggee memory space
&nbsp;&nbsp;&nbsp; if (pHelper-&gt;ReadDebuggeeMemoryEx(pHelper, pHelper-&gt;GetRealAddress(pHelper), sizeof(u_test_level1 ), &amp;example, &amp;nGot) != S_OK)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return E_FAIL;
&nbsp;&nbsp;&nbsp; if (nGot != sizeof(u_test_level1))
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return E_FAIL;

&nbsp;&nbsp;&nbsp; 

&nbsp;&nbsp;&nbsp; example.print(pResult, max);

&nbsp;&nbsp;&nbsp; return S_OK;
}</code></pre>
<p>into the existing &#8220;timeaddin.cpp&#8221; file, and add the export to both the header (timeaddin.h) and the def file (eeaddin.def).</p>
<p>Compile this and it will generate EEaddin.dll. Copy this DLL to your VS2010 IDE folder (C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE).</p>
<p>Add the line</p>
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>u_test_level1=$ADDIN(eeaddin.dll,AddIn_blogtest)</code></pre>
<p>to your autoexp.dat, right below the previous example under the [AutoExpand] tag.</p>
<p>Debug your application&#8230; your new display should be</p>
<div class="separator" style="clear: both;text-align: center">
<a href="http://2.bp.blogspot.com/-6k3L0yvoZxQ/TdhHaSLoKZI/AAAAAAAAACY/eawMunthZ6k/s1600/autoexp_blog4.JPG"><img border="0" src="http://2.bp.blogspot.com/-6k3L0yvoZxQ/TdhHaSLoKZI/AAAAAAAAACY/eawMunthZ6k/s1600/autoexp_blog4.JPG" /></a></div>
<p>Each element now displays correctly according to its m_type. We have an issue, however: the string is not immediately available. This is because the EEAddin works in a different memory space than the executable it is accessing. You&#8217;ll note that the EEAddin DLL reads memory from the application manually using the supplied address, and by dint of the autoexp.dat line we know the size of the type we&#8217;re representing. Anything that type points at, such as string data, is not part of that read, so we must pull it over manually in order to display it.</p>
<p>The new code is</p>
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>enum e_union_test_case
{
&nbsp;&nbsp;&nbsp; k_unset,
&nbsp;&nbsp;&nbsp; k_type_string,
&nbsp;&nbsp;&nbsp; k_type_int,
&nbsp;&nbsp;&nbsp; k_type_float,
};

struct u_test_level1
{
&nbsp;&nbsp;&nbsp; int m_type;
&nbsp;&nbsp;&nbsp; union u_data
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; const char *m_string;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int m_count;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; float m_value;
&nbsp;&nbsp;&nbsp; }m_data;

&nbsp;&nbsp;&nbsp; void print(char *result_buffer, int result_size)
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_unset)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"unset\n");
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_string)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"str:%s\n",m_data.m_string);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_int)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"int:%d \n",m_data.m_count);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (m_type == k_type_float)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sprintf_s(result_buffer, result_size,"flt:%f \n",m_data.m_value);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }

&nbsp;&nbsp;&nbsp; }
};

HRESULT ADDIN_API AddIn_blogtest( DWORD dwAddress, DEBUGHELPER *pHelper, int nBase, BOOL bUniStrings, char *pResult, size_t max, DWORD reserved )
{
&nbsp;&nbsp;&nbsp; DWORD nGot;

&nbsp;&nbsp;&nbsp; u_test_level1 example;

&nbsp;&nbsp;&nbsp; // read the u_test_level1 instance from debuggee memory space
&nbsp;&nbsp;&nbsp; if (pHelper-&gt;ReadDebuggeeMemoryEx(pHelper, pHelper-&gt;GetRealAddress(pHelper), sizeof(u_test_level1 ), &amp;example, &amp;nGot) != S_OK)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return E_FAIL;
&nbsp;&nbsp;&nbsp; if (nGot != sizeof(u_test_level1))
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return E_FAIL;

&nbsp;&nbsp;&nbsp; static const int local_buffer_size= 1024;
&nbsp;&nbsp;&nbsp; char local_buffer[local_buffer_size];

&nbsp;&nbsp;&nbsp; if (example.m_type == k_type_string &amp;&amp; example.m_data.m_string != 0)
&nbsp;&nbsp;&nbsp; {
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // pull over the string to our local buffer
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pHelper-&gt;ReadDebuggeeMemoryEx(pHelper, (DWORDLONG)example.m_data.m_string, local_buffer_size, local_buffer, &amp;nGot);

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; example.m_data.m_string= local_buffer;
&nbsp;&nbsp;&nbsp; }

&nbsp;&nbsp;&nbsp; example.print(pResult, max);

&nbsp;&nbsp;&nbsp; return S_OK;
}</code></pre>
<p>This code uses the pointer within the original &#8220;example&#8221; to pull over a fixed-size block of data (1 KB), relying upon the string being null terminated within that block. It then points our local copy of example.m_data.m_string at that local buffer before calling our new print(&#8230;) function, which will now output the string as before. Our new display is:</p>
<div class="separator" style="clear: both;text-align: center">
<a href="http://1.bp.blogspot.com/-ojGQanJjP7U/TdhLLfDi8tI/AAAAAAAAACc/_dBEZ0fvPQc/s1600/autoexp_blog5.JPG"><img border="0" src="http://1.bp.blogspot.com/-ojGQanJjP7U/TdhLLfDi8tI/AAAAAAAAACc/_dBEZ0fvPQc/s1600/autoexp_blog5.JPG" /></a></div>
<p>which is exactly what we need.</p>
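<p>The string copy above trusts the debuggee to have a null terminator somewhere in the first 1 KB. As a minimal sketch of one way to stop relying on that, the helper below forces a terminator onto the local copy. The name terminate_local_copy and its parameters are hypothetical and not part of the EEAddin sample; it assumes the byte count passed in is what the read reported (nGot in the code above).</p>

```cpp
#include <cstddef>

// Hypothetical helper, not part of the EEAddin sample: guarantee that a
// buffer filled from debuggee memory ends in a null terminator.
//   buffer      - local buffer the debuggee bytes were copied into
//   buffer_size - capacity of buffer in bytes
//   bytes_read  - number of bytes the read reported (nGot above)
inline void terminate_local_copy(char *buffer, std::size_t buffer_size, std::size_t bytes_read)
{
    if (buffer == 0 || buffer_size == 0)
        return;

    // Terminate just after the data we received, or on the last byte of the
    // buffer if the read filled it (or reported a bogus, oversized count).
    std::size_t last = (bytes_read < buffer_size) ? bytes_read : buffer_size - 1;
    buffer[last] = '\0';
}
```

<p>In AddIn_blogtest this would be called as terminate_local_copy(local_buffer, local_buffer_size, nGot) once the second ReadDebuggeeMemoryEx call has been checked for S_OK. A string longer than the buffer is simply truncated in the watch window rather than read past the end.</p>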
<p>Notes:</p>
<ul>
<li>EEAddin is not forgiving</li>
<ul>
<li>If you do something heinous within an EEAddin entry point it WILL take out the devenv that loaded it. Within add-ins you MUST program defensively at all points: assume you will be sent bad data, assume your strings will not be null terminated, and assume that the pointers you need to follow are bad and will take you into no man&#8217;s land. Verify everything. You&#8217;ll note that I did not do this in my example, but that is merely to keep the code samples short and sweet.
</li>
</ul>
<li>Debugging the EEAddin</li>
<ul>
<li>Load up the EEAddin project, attach to the instance of devenv that your main project runs in (i.e. debug the debugger), put breakpoints in your add-in entry point, then hit debug on your application. Each time your Watch Window updates, your EEAddin DLL is loaded, called and unloaded, so expect a lot of calls in the array case.</li>
</ul>
<li>Global Symbols are not accessible </li>
<ul>
<li>Accessing client memory requires the address of that memory. If this address isn&#8217;t reachable from the element you&#8217;re debugging then you have no way to get to it. An example of this limitation would be a handle into a global manager: the client code can call the manager to get the object, the debugger can only display the value of the handle, and the EEAddin can only access the internals OF the handle. It has no knowledge of the manager and thus cannot access the object itself.</li>
</ul>
<li>Not all Endianness was made equal</li>
<ul>
<li>If your target does not have the same endianness as the PC (little), as is the case with the Xbox 360, then you must swap the byte order of the data you read before interpreting it.</li>
</ul>
<li>Alignment &amp; Pointer size</li>
<ul>
<li>You will have to manually handle both alignment differences (target &lt;=&gt; client) and pointer size differences. In my support thus far I have chosen two pathways. Some structures I have made entirely cross platform by supplying a construct that is 64-bit and aware of its pointer size. For other types where this isn&#8217;t possible I have used an element of the next topic to disclose the size of our pointer to the DLL.</li>
</ul>
</ul>
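<p>For the endianness note above, here is a minimal sketch of the kind of byte swapping an add-in would need when the debuggee is big endian (as on Xbox 360) and the add-in runs on a little endian PC. The function names are hypothetical and not part of the EEAddin sample, and the float helper assumes a 32-bit IEEE 754 float on both sides.</p>

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not part of the EEAddin sample.

// Reverse the byte order of a 32-bit value read from the debuggee.
inline std::uint32_t swap_bytes_32(std::uint32_t value)
{
    return ((value & 0x000000FFu) << 24) |
           ((value & 0x0000FF00u) << 8)  |
           ((value & 0x00FF0000u) >> 8)  |
           ((value & 0xFF000000u) >> 24);
}

// Build a float from raw debuggee bits that are in the opposite byte order.
// The swap is done on the integer bits and only then copied into a float,
// so the value is never interpreted while it is still byte-reversed.
inline float float_from_swapped_bits(std::uint32_t raw_bits)
{
    std::uint32_t bits = swap_bytes_32(raw_bits);
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```

<p>Applied to the u_test_level1 example, m_type and m_data.m_count would go through swap_bytes_32 and m_data.m_value through float_from_swapped_bits (reading the union as raw 32-bit data first), all before print(&#8230;) is called. Pointer members need the same treatment at whatever size the target uses.</p>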
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/06/22/extending-the-watch-window-in-visual-studio-via-eeaddin/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Extending the Watch Window in Visual Studio via autoexp.dat</title>
		<link>http://www.altdevblogaday.com/2011/06/08/watch-window-part1/</link>
		<comments>http://www.altdevblogaday.com/2011/06/08/watch-window-part1/#comments</comments>
		<pubDate>Wed, 08 Jun 2011 01:20:45 +0000</pubDate>
		<dc:creator>Andy Firth</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://altdevblogaday.org/?p=7246</guid>
		<description><![CDATA[<p>Watch windows are important: they show us our data in various forms and generally enable us to debug effectively. Sometimes, however, they need help.</p>
<p>Set up a simple Windows console project and use the following code:</p>
<p><a href="http://www.altdevblogaday.com/2011/06/08/watch-window-part1/" class="more-link">Read more on Extending the Watch Window in Visual Studio via autoexp.dat&#8230;</a></p>
]]></description>
			<content:encoded><![CDATA[<p>Watch windows are important: they show us our data in various forms and generally enable us to debug effectively. Sometimes, however, they need help.</p>
<p>Set up a simple Windows console project and use the following code:</p>
<pre style="font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;color: #000000;background-color: #eee;font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px;overflow: auto;width: 100%"><code>#include &lt;stdio.h&gt;  // printf
#include &lt;string.h&gt; // memset, strlen
#include &lt;tchar.h&gt;  // _tmain, _TCHAR

struct s_test_level1
{
    int m_count;
    const char* m_string; // const so it can point at a string literal
    float m_value;

    void print(void)
    {
        if (m_string)
        {
            printf(&quot;%s %d %f\n&quot;,m_string, m_count, m_value);
        }
    }
};

static const int k_num_elements= 32;
struct s_test_level2
{
    s_test_level1 m_elements[k_num_elements];
};

int _tmain(int argc, _TCHAR* argv[])
{
    s_test_level2 test;

    memset(&amp;test,0,sizeof(test));

    test.m_elements[0].m_string= &quot;my foo&quot;;
    test.m_elements[0].m_count= (int)strlen(test.m_elements[0].m_string);
    test.m_elements[0].m_value=3.14159265358f; // everyone loves Pi

    test.m_elements[0].print();

    return 0; // &lt;&lt;= breakpoint here
}
</code></pre>
<p>Now run the program with a breakpoint on the indicated line. Open up a watch window, drop test into it and expand the m_elements array; you should see something like</p>
<div class="separator" style="clear: both;text-align: center">
<a href="http://4.bp.blogspot.com/-fD57I8WDYwY/TdgldimnO9I/AAAAAAAAACM/7bisL11Jdj4/s1600/autoexp_blog.JPG"><img border="0" src="http://4.bp.blogspot.com/-fD57I8WDYwY/TdgldimnO9I/AAAAAAAAACM/7bisL11Jdj4/s1600/autoexp_blog.JPG" /></a></div>
<p>This is a really simple example, but even now we see some details we don&#8217;t always need&#8230; let&#8217;s reformat it.</p>
<p>Open up the file</p>
<p><span style="font-size: x-small">VS2010 PC: C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Packages\Debugger\autoexp.dat</span><br />
<span style="font-size: x-small">VS2008 PC: C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\Packages\Debugger\autoexp.dat </span><br />
<span style="font-size: x-small">Xbox360: C:\Program Files (x86)\Microsoft Xbox 360 SDK\bin\win32\autoexp.dat</span></p>
<p>Search for &#8220;[AutoExpand]&#8221; and add the line</p>
<pre style="background-color: #eeeeee;border: 1px dashed #999999;color: black;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace;font-size: 12px;line-height: 14px;overflow: auto;padding: 5px;width: 100%"><code>s_test_level1=&lt;m_count,d&gt;, &lt;m_string,s&gt;, &lt;m_value,f&gt;f</code></pre>
<p>Save that file and re-run the test program&#8230; you should now see</p>
<div class="separator" style="clear: both;text-align: center">
<a href="http://4.bp.blogspot.com/-Ozlbj9ygezw/Tdgp8kkZPtI/AAAAAAAAACQ/grVtTN8vY-Y/s1600/autoexp_blog2.JPG"><img border="0" src="http://4.bp.blogspot.com/-Ozlbj9ygezw/Tdgp8kkZPtI/AAAAAAAAACQ/grVtTN8vY-Y/s1600/autoexp_blog2.JPG" /></a></div>
<p>a much easier to read version of the same data.</p>
<p>All the original information is still available should you need it but this new format is much easier to quickly consume.</p>
<p>This form of watch window help is somewhat limited, however: you can only interpret existing data, remove superfluous detail, do some VERY rudimentary math and generally make things cleaner and easier to consume; see the next post for MORE.</p>
<p>All of the help required to use autoexp.dat is within the file itself. Note that the &#8220;[Visualizer]&#8221; section is currently not available for Xbox 360 targets but is VERY powerful for other targets.</p>
<p>enjoy and feel free to ask questions.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.altdevblogaday.com/2011/06/08/watch-window-part1/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

