Pre-Grant Publication Number: 20070130448

Discussion (6)
Marc Mengel (about 1 year ago)
Okay, so as I read this, this is a convoluted way of, in hardware, recognizing that you're planning to write a result to memory and then read it back from memory, and instead reading the result from the register and/or from the output of the execution unit rather than waiting for the memory write/read sequence.

So my question is, other than being done at runtime, how is this different from what peephole optimizers have been doing forever?
Elizabeth Haring (about 1 year ago)
Could you provide an excerpt along with your prior art submission, since the URL to which the prior art points leads to a PDF file that may only be accessed by ACM Web account holders?
Elizabeth Haring (about 1 year ago)
My understanding is that peephole optimization is a compiler optimization technique?
Marc Mengel (about 1 year ago)
Oh! I figured the Patent Office would have subscriptions to ACM online, and IEEE, to look for prior art, etc.

The online article is a scan, not converted to text, so I'll hand type some snippets...

"Redundant instructions may be discarded during the final stage of compilation by using a simple optimizing technique called peephole optimization. The method is described and examples are given.
...
Example 1. Source code:
x := y;
z := x + z;
Compiled Code:
LDA Y -- load the accumulator from Y
STA X -- store the accumulator in X
LDA X -- load the accumulator from X
ADD Z -- add the contents of Z
STA Z -- store the accumulator in Z
If the store instruction is nondestructive, the third instruction is redundant and may be discarded.
...
[he discusses a case where the load and store would not, originally, be adjacent]
but straightforward translation leads to less than optimum code. When the optimizing code emitter produces a commutative operator (addition, multiplication...) it must first check the preceding instruction. If that instruction was LDA, then the optimizer may choose, if the instruction preceding the LDA was STA, to reorder the commutative operation so as to avoid the LDA, as in Example 1. Having avoided the LDA, the optimizer may check to see if the STA was in fact a store to a temporary location, in which case the STA may also be discarded."

And the conclusion:

"These techniques have been used in GOGOL, a translator written by the author for the PDP-1 time sharing system at Stanford. The optimizing code emitting routine consists of about 400 of the 2000 instructions required for the compiler. We have found that the code emitter can see, through a very narrow peephole, enough to make considerable improvement in the object code. The limitation on how well the optimizer will work seems to depend primarily on how much time and space are available for recognizing redundant sequences of instructions."
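To make the quoted Example 1 concrete, here's a minimal sketch of that redundant-load elimination as a peephole pass over tuples of (opcode, operand). This is an illustration, not the paper's GOGOL code; the instruction encoding is assumed for the example:

```python
# Minimal peephole pass: drop a load that immediately follows a store
# to the same location (assuming the store is nondestructive, so the
# accumulator still holds the stored value).
def peephole(code):
    out = []
    for op, operand in code:
        # "STA X" followed by "LDA X": the LDA is redundant.
        if op == "LDA" and out and out[-1] == ("STA", operand):
            continue  # accumulator already holds this value
        out.append((op, operand))
    return out

compiled = [("LDA", "Y"), ("STA", "X"), ("LDA", "X"),
            ("ADD", "Z"), ("STA", "Z")]
print(peephole(compiled))  # the redundant ("LDA", "X") is removed
```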
Elizabeth Haring (about 1 year ago)
Another area to explore is to look at methods for improving upon Memory Renaming as mentioned in this patent application, but feel free to submit more examples of peephole optimization and how it compares to Memory Renaming techniques. Also, please take a look at the latest prior art that was submitted.
Marc Mengel (about 1 year ago)
What he is doing in "memory renaming" is changing the operand source of an instruction -- changing it from a memory input to a register input (possibly an intermediate register). This is also something peephole optimizers have been doing forever. I know I did it in my compiler design class at Purdue in 1984, as part of a class project. So as I mentioned, the only novel thing going on here is doing these optimizations in a CPU. Not that you might not consider that novel, but it seems a whole lot less novel when you realize it's been done elsewhere (i.e., in compiler backends) for decades. Or maybe the circuitry he designed to do it is novel. But the overall scheme is nothing new.
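The operand-source rewriting described above can be sketched as a compile-time pass. This is a hypothetical illustration (the three-address encoding and opcode names are assumptions, not from the application): ("STORE", mem, reg) writes a register to memory, ("LOAD", reg, mem) reads memory into a register, and a LOAD whose memory operand was last written from a still-live register becomes a plain register move:

```python
# Hypothetical sketch of operand-source rewriting, the compile-time
# analogue of the application's "memory renaming": if a memory operand
# was recently stored from a known register, read the register instead.
def rename_memory_operands(code):
    last_store = {}  # memory location -> register last stored there
    out = []
    for op, dst, src in code:
        if op == "LOAD" and src in last_store:
            op, src = "MOVE", last_store[src]  # bypass the memory read
        out.append((op, dst, src))
        if op == "STORE":
            last_store[dst] = src
        else:
            # dst register was overwritten: forget stores sourced from it
            last_store = {m: r for m, r in last_store.items() if r != dst}
    return out

code = [("STORE", "x", "r1"),   # x := r1
        ("LOAD",  "r2", "x")]   # r2 := x  -> becomes a move from r1
print(rename_memory_operands(code))
```

The hardware version does the same bookkeeping at runtime in the store queue rather than at compile time.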