Keeping the target address in a register until the instruction is retired

2020-01-12 (original post)

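I'd like to use Precise Event-Based Sampling (PEBS) to record all the addresses of specific events (cache misses, for example) on a Xeon E5 Sandy Bridge.

However, the Performance Analysis Guide for Intel® Core™ i7 Processor and Intel® Xeon™ 5500 processors (p. 24) contains the following warning: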

As the PEBS mechanism captures the values of the register at completion of the instruction, the dereferenced address for the following type of load instruction (Intel asm convention) cannot be reconstructed.

MOV RAX, [RAX+const]

This kind of instruction is mostly associated with pointer chasing

mystruc = mystruc->next;

This is a significant shortcoming of this approach to capturing memory instruction addresses.

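According to objdump, my program contains some load instructions of this type. Is there any way to avoid them?

Since this is an Intel-specific problem, the solution doesn't have to be portable in any way; it just has to work. My code is written in C, and ideally I'm looking for a compiler-level solution (gcc or icc), but any suggestions are welcome.

Some examples: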

mov    0x18(%rdi),%rdi

mov    (%rcx,%rax,8),%rax

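In both cases, once the instruction has retired (and therefore by the time I look at the register values to work out where I loaded from), the value of the address (%rdi + 0x18 and %rcx + 8 * %rax, respectively, in these examples) has been overwritten by the result of the mov.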
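For reference, here is a minimal C sketch of the kind of source that tends to produce such loads. The `struct node` layout and the function names are hypothetical, chosen only so that `next` sits at offset 0x18; whether a given compiler actually emits `mov 0x18(%rdi),%rdi` depends on its version and optimization flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type: 24 bytes of payload put `next` at offset 0x18. */
struct node {
    uint64_t payload[3];
    struct node *next;
};

/* Walk the list and return the last node (NULL for an empty list). */
struct node *list_last(struct node *p)
{
    if (p == NULL)
        return NULL;
    while (p->next != NULL)
        p = p->next;    /* pointer chasing: may become mov 0x18(%rdi),%rdi */
    return p;
}

/* Count the nodes; the same self-overwriting load drives this loop. */
size_t list_length(const struct node *p)
{
    size_t n = 0;

    for (; p != NULL; p = p->next)
        n++;
    return n;
}
```

Compiling a file like this with `-S` and searching the output for loads whose destination register also appears in the address is a quick way to see how common the pattern is in a given program.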

Solution

What you want to do is convert all instructions of the form:

mov    (%rcx,%rax,8),%rax

into:

mov    (%rcx,%rax,8),%r11
mov    %r11,%rax

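This is most easily done by modifying the compiler-generated assembler source. Below is a Perl script that does all the necessary conversions by reading and modifying the .s file.

Just change the build to produce .s files instead of .o files, apply the script, and then produce the .o with either as or gcc.

Here is the actual script. I've tested it on a few of my own sources, following the build procedure described in its top comment block.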

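The script has the following features:

- scans for and locates all function definitions
- identifies all registers used in a given function
- locates all return points of the function
- selects a temporary register based on the function's register usage (i.e. it uses a temp register that the function does not already use)
- replaces every "troublesome" instruction with a two-instruction sequence
- tries unused scratch registers first (e.g. %r11 or unused argument registers) before falling back to callee-saved registers
- if the selected register is callee-saved, adds a push to the function prolog and a pop before each of the function's [possibly multiple] ret statements
- keeps a log of all analysis and transformations and appends it as comments to the output .s file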
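The register selection described in the list above can be sketched in C as follows. This is only an illustration of the idea, not the script itself: `pick_temp_reg`, `struct candidate` and the `has_stack_args` flag are hypothetical names, and the candidate order simply copies the order in which the script's `regtmpall` registers its candidates (scratch, then argument, then callee-preserved registers). Options such as `-n10` or `-N` are not modeled.

```c
#include <stddef.h>
#include <string.h>

/* Candidate temporaries in the order they are tried:
 * 'T' = scratch, 'A' = argument register, 'P' = callee-preserved. */
struct candidate {
    const char *reg;
    char type;
};

static const struct candidate candidates[] = {
    {"%r11", 'T'}, {"%r10", 'T'},
    {"%r9", 'A'},  {"%r8", 'A'},  {"%rcx", 'A'},
    {"%rdx", 'A'}, {"%rsi", 'A'}, {"%rdi", 'A'},
    {"%r15", 'P'}, {"%r14", 'P'}, {"%r13", 'P'}, {"%r12", 'P'},
};

/* Return 1 if `reg` is in the function's list of used base registers. */
static int is_used(const char *reg, const char *const used[], size_t nused)
{
    for (size_t i = 0; i < nused; i++)
        if (strcmp(reg, used[i]) == 0)
            return 1;
    return 0;
}

/* Pick the first unused candidate.  A 'P' register needs push/pop, so it is
 * rejected when the function reads stack arguments.  On success *needs_save
 * tells the caller whether push/pop must be injected.  Returns NULL if no
 * candidate fits. */
const char *pick_temp_reg(const char *const used[], size_t nused,
                          int has_stack_args, int *needs_save)
{
    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        const struct candidate *c = &candidates[i];

        if (is_used(c->reg, used, nused))
            continue;
        if (c->type == 'P' && has_stack_args)
            continue;
        *needs_save = (c->type == 'P');
        return c->reg;
    }
    return NULL;
}
```

A function that only touches %rax and %rdi would get %r11 with no save needed; a function that already uses every scratch and argument register would get %r15 and require the injected push/pop.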

#!/usr/bin/perl
# pebsfix/pebsfixup -- fix assembler source for PEBS usage
#
# command line options:
#   "-a" -- use only full 64 bit targets
#   "-l" -- do _not_ use lea
#   "-D[diff-file]" -- show differences (default output: "./DIFF")
#   "-n10" -- do _not_ use register %r10 for temporary (default is use it)
#   "-o" -- overwrite input files (can be multiple)
#   "-O<outfile>" -- output file (only one .s input allowed)
#   "-q" -- suppress warnings
#   "-T[lvl]" -- debug trace
#
# "-o" and "-O" are mutually exclusive
#
# command line script test options:
#   "-N[TPA]" -- disable temp register types [for testing]
#   "-P" -- force push/pop on all functions
#
# command line arguments:
#   1-- list of .s files to process [or directory to search]
#       for a given file "foo.s", output is to "foo.TMP"
#       (if -o is given, "foo.TMP" is renamed to "foo.s")
#
# suggested usage:
#   change build to produce .s files
#   FROM:
#     cc [options] -c foo.c
#   TO:
#     cc [options] -c -S foo.c
#     pebsfixup -o foo.s
#     cc -c foo.s
#
# suggested compiler options:
# [probably only really needed if push/pop required. use -NP to verify]
#   (1) use either of
#       -O2 -fno-optimize-sibling-calls
#       -O1
#   (2) use -mno-omit-leaf-frame-pointer
#   (3) use -mno-red-zone [probably not required in any case]
#
# NOTES:
#   (1) red zones are only really useful for leaf functions (i.e. if fncA calls
#       fncB, fncA's red zone would be clobbered)
#   (2) pushing onto the stack isn't a problem if there is a formal stack frame
#   (3) the push is okay if the function has no more than six arguments (i.e.
#       does _not_ use positive offsets from %rsp to access them)

#pragma pgmlns
use strict qw(vars subs);

our $pgmtail;

our $opt_a;
our $opt_T;
our $opt_D;
our $opt_l;
our $opt_n10;
our $opt_N;
our $opt_P;
our $opt_q;
our $opt_o;
our $opt_O;
our $opt_s;

our @reguse;
our %reguse_tobase;
our %reguse_isbase;
our $regusergx;

our @regtmplist;
our %regtmp_type;

our $diff;

our $sepflg;
our $fatal;
our @cmtprt;

master(@ARGV);
exit(0);

# master -- master control
sub master
{
    my(@argv) = @_;
    my($xfsrc);
    my($file,@files);
    my($bf);

    $pgmtail = "pebsfixup";

    optget(\@argv);

    # define all known/usable registers
    regusejoin();

    # define all registers that we may use as a temporary
    regtmpall();

    if (defined($opt_D)) {
        unlink($opt_D);
    }

    # show usage
    if (@argv <= 0) {
        $file = $0;
        open($xfsrc,"<$file") ||
            sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);

        while ($bf = <$xfsrc>) {
            chomp($bf);
            next if ($bf =~ /^#!/);
            last unless ($bf =~ s/^#//);
            $bf =~ s/^# ?//;
            print($bf,"\n");
        }

        close($xfsrc);
        exit(1);
    }

    foreach $file (@argv) {
        if (-d $file) {
            dodir(\@files,$file);
        }
        else {
            push(@files,$file);
        }
    }

    if (defined($opt_O)) {
        sysfault("$pgmtail: -O may have only one input file\n")
            if (@files != 1);
        sysfault("$pgmtail: -O and -o are mutually exclusive\n")
            if ($opt_o);
    }

    foreach $file (@files) {
        dofile($file);
    }

    if (defined($opt_D)) {
        exec("less",$opt_D);
    }
}

# dodir -- process directory
sub dodir
{
    my($files,$dir) = @_;
    my($file,@files);

    @files = (`find $dir -type f -name '*.s'`);
    foreach $file (@files) {
        chomp($file);
        push(@$files,$file);
    }
}

# dofile -- process file
sub dofile
{
    my($file) = @_;
    my($ofile);
    my($xfsrc);
    my($xfdst);
    my($bf,$lno,$outoff);
    my($fixoff);
    my($lhs,$rhs);
    my($xop,$arg);
    my($ix);
    my($sym,$val,$typ);
    my(%sym_type);
    my($fnc,$fnx,%fnx_lookup,@fnxlist);
    my($retlist);
    my($uselook,@uselist,%avail);
    my($fixreg,$fixrtyp);
    my($sixlist);
    my($fix,$fixlist);
    my($fixtot);
    my(@fix);
    my(@outlist);
    my($relaxflg);
    my($cmtchr);

    undef($fatal);
    undef(@cmtprt);

    msgprt("\n")
        if ($sepflg);
    $sepflg = 1;
    msgprt("$pgmtail: processing %s ...\n",$file);

    $cmtchr = "#";

    cmtprt("%s\n","-" x 78);
    cmtprt("FILE: %s\n",$file);

    # get the output file
    $ofile = $file;
    sysfault("$pgmtail: bad suffix -- file='%s'\n",$file)
        unless ($ofile =~ s/[.]s$//);
    $ofile .= ".TMP";

    # use explicit output file
    if (defined($opt_O)) {
        $ofile = $opt_O;
        sysfault("$pgmtail: output file may not be input file -- use -o instead\n")
            if ($ofile eq $file);
    }

    open($xfsrc,"<$file") ||
        sysfault("$pgmtail: unable to open '%s' -- $!\n",$file);

    $lno = 0;
    while ($bf = <$xfsrc>) {
        chomp($bf);
        $bf =~ s/\s+$//;

        $outoff = $lno;
        ++$lno;

        push(@outlist,$bf);

        # clang adds comments
        $ix = index($bf,"#");
        if ($ix >= 0) {
            $bf = substr($bf,0,$ix);
            $bf =~ s/\s+$//;
        }

        # look for ".type blah,@function"
        # NOTE: this always comes before the actual label line [we hope ;-)]
        if ($bf =~ /^\s+[.]type\s+([^,]+),\s*(\S+)/) {
            ($sym,$val) = ($1,$2);
            $val =~ s/^\@//;
            $sym_type{$sym} = $val;
            cmtprt("\n");
            cmtprt("TYPE: %s --> %s\n",$sym,$val);
            next;
        }

        # look for "label:"
        if ($bf =~ /^([a-z_A-Z][a-z_A-Z0-9]*):$/) {
            $sym = $1;
            next if ($sym_type{$sym} ne "function");

            $fnc = $sym;
            cmtprt("FUNCTION: %s\n",$fnc);

            $fnx = {};
            $fnx_lookup{$sym} = $fnx;
            push(@fnxlist,$fnx);

            $fnx->{fnx_fnc} = $fnc;
            $fnx->{fnx_outoff} = $outoff;

            $uselook = {};
            $fnx->{fnx_used} = $uselook;

            $retlist = [];
            $fnx->{fnx_retlist} = $retlist;

            $fixlist = [];
            $fnx->{fnx_fixlist} = $fixlist;

            $sixlist = [];
            $fnx->{fnx_sixlist} = $sixlist;
            next;
        }

        # remember all registers used by function:
        while ($bf =~ /($regusergx)/gpo) {
            $sym = ${^MATCH};
            $val = $reguse_tobase{$sym};
            dbgprt(3,"dofile: REGUSE sym='%s' val='%s'\n",$sym,$val);

            $uselook->{$sym} += 1;

            $uselook->{$val} += 1
                if ($val ne $sym);
        }

        # handle returns
        if ($bf =~ /^\s+ret/) {
            push(@$retlist,$outoff);
            next;
        }
        if ($bf =~ /^\s+rep[a-z]*\s+ret/) {
            push(@$retlist,$outoff);
            next;
        }

        # split up "movq 16(%rax),%rax" ...
        $ix = rindex($bf,",");
        next if ($ix < 0);

        # ... into "movq 16(%rax)"
        $lhs = substr($bf,0,$ix);
        $lhs =~ s/\s+$//;

        # check for "movq 16(%rsp)" -- this means that the function has/uses
        # more than six arguments (i.e. we may _not_ push/pop because it
        # wreaks havoc with positive offsets)
        # FIXME/CAE -- we'd have to adjust them by 8 which we don't do
        (undef,$rhs) = split(" ",$lhs);
        if ($rhs =~ /^(\d+)[(]%rsp[)]$/) {
            push(@$sixlist,$outoff);
            cmtprt("SIXARG: %s (line %d)\n",$rhs,$lno);
        }

        # ... and "%rax"
        $rhs = substr($bf,$ix + 1);
        $rhs =~ s/^\s+//;

        # target must be a [simple] register [or source scan will blow up]
        # (e.g. we actually had "cmp %ebp,(%rax,%r14)")
        next if ($rhs =~ /[)]/);

        # ensure we have the "%" prefix
        next unless ($rhs =~ /^%/);

        # we only want the full 64 bit reg as target
        # (e.g. "mov (%rbx),%al" doesn't count)
        $val = $reguse_tobase{$rhs};
        if ($opt_a) {
            next if ($val ne $rhs);
        }
        else {
            next unless (defined($val));
        }

        # source operand must contain target [base] register
        next unless ($lhs =~ /$val/);
        ###cmtprt("1: %s,%s\n",$lhs,$rhs);

        # source operand must be of the "right" type
        # FIXME/CAE -- we may need to revise this
        next unless ($lhs =~ /[(]/);
        cmtprt("NEEDFIX: %s,%s (line %d)\n",$lhs,$rhs,$lno);

        # remember the place we need to fix for later
        $fix = {};
        push(@$fixlist,$fix);
        $fix->{fix_outoff} = $outoff;
        $fix->{fix_lhs} = $lhs;
        $fix->{fix_rhs} = $rhs;
    }

    close($xfsrc);

    # get total number of fixups
    foreach $fnx (@fnxlist) {
        $fixlist = $fnx->{fnx_fixlist};
        $fixtot += @$fixlist;
    }
    msgprt("$pgmtail: needs %d fixups\n",$fixtot)
        if ($fixtot > 0);

    # fix each function
    foreach $fnx (@fnxlist) {
        cmtprt("\n");
        cmtprt("FNC: %s\n",$fnx->{fnx_fnc});

        $fixlist = $fnx->{fnx_fixlist};

        # get the fixup register
        ($fixreg,$fixrtyp) = regtmploc($fnx,$fixlist);

        # show number of return points
        {
            $retlist = $fnx->{fnx_retlist};
            cmtprt("  RET: %d\n",scalar(@$retlist));
            last if (@$retlist >= 1);

            # NOTE: we display this warning because we may not be able to
            # handle all situations

            $relaxflg = (@$fixlist <= 0) || ($fixrtyp ne "P");
            last if ($relaxflg && $opt_q);

            errprt("$pgmtail: in file '%s'\n",$file);
            errprt("$pgmtail: function '%s' has no return points\n",$fnx->{fnx_fnc});
            errprt("$pgmtail: suggest recompile with correct options\n");

            if (@$fixlist <= 0) {
                errprt("$pgmtail: working around because function needs no fixups\n");
                last;
            }

            if ($fixrtyp ne "P") {
                errprt("$pgmtail: working around because fixup reg does not need to be saved\n");
                last;
            }
        }

        # show stats on register usage in function
        $uselook = $fnx->{fnx_used};
        @uselist = sort(keys(%$uselook));
        cmtprt("  USED:\n");
        %avail = %reguse_isbase;
        foreach $sym (@uselist) {
            $val = $uselook->{$sym};

            $typ = $regtmp_type{$sym};
            $typ = sprintf(" (TYPE: %s)",$typ)
                if (defined($typ));

            cmtprt("    %s used %d%s\n",$sym,$val,$typ);
            $val = $reguse_tobase{$sym};
            delete($avail{$val});
        }

        # show function's available [unused] registers
        @uselist = keys(%avail);
        @uselist = sort(regusesort @uselist);
        if (@uselist > 0) {
            cmtprt("  AVAIL:\n");
            foreach $sym (@uselist) {
                $typ = $regtmp_type{$sym};
                $typ = sprintf(" (TYPE: %s)",$typ)
                    if (defined($typ));
                cmtprt("    %s%s\n",$sym,$typ);
            }
        }

        # skip over any functions that don't need fixing _and_ have a temp
        # register
        if (@$fixlist <= 0 && (! $opt_P)) {
            next if (defined($fixreg));
        }

        msgprt("$pgmtail: function %s\n",$fnx->{fnx_fnc});

        # skip function because we don't have a fixup register but report it
        # here
        unless (defined($fixreg)) {
            $bf = (@$fixlist > 0) ? "FATAL" : "can be ignored -- no fixups needed";
            msgprt("$pgmtail: FIXNOREG (%s)\n",$bf);
            cmtprt("  FIXNOREG (%s)\n",$bf);
            next;
        }

        msgprt("$pgmtail: FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);
        cmtprt("  FIXREG --> %s (TYPE: %s)\n",$fixreg,$fixrtyp);

        foreach $fix (@$fixlist) {
            $outoff = $fix->{fix_outoff};

            undef(@fix);
            cmtprt("  FIXOLD %s\n",$outlist[$outoff]);

            # original
            if ($opt_l) {
                $bf = sprintf("%s,%s",$fix->{fix_lhs},$fixreg);
                push(@fix,$bf);
                $bf = sprintf("\tmov\t%s,%s",$fixreg,$fix->{fix_rhs});
                push(@fix,$bf);
            }

            # use lea
            else {
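                ($xop,$arg) = split(" ",$fix->{fix_lhs});
                # compute the address into the temp register (it survives the load)
                $bf = sprintf("\tlea\t\t%s,%s",$arg,$fixreg);
                push(@fix,$bf);
                # do the original load through the temp register
                $bf = sprintf("\t%s\t(%s),%s",$xop,$fixreg,$fix->{fix_rhs});
                push(@fix,$bf);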
            }

            foreach $bf (@fix) {
                cmtprt("  FIXNEW %s\n",$bf);
            }

            $outlist[$outoff] = [@fix];
        }

        unless ($opt_P) {
            next if ($fixrtyp ne "P");
        }

        # fix the function prolog
        $outoff = $fnx->{fnx_outoff};
        $lhs = $outlist[$outoff];
        $rhs = sprintf("\tpush\t%s",$fixreg);
        $bf = [$lhs,$rhs,""];
        $outlist[$outoff] = $bf;

        # fix the function return points
        $retlist = $fnx->{fnx_retlist};
        foreach $outoff (@$retlist) {
            $rhs = $outlist[$outoff];
            $lhs = sprintf("\tpop\t%s",$fixreg);
            $bf = ["",$lhs,$rhs];
            $outlist[$outoff] = $bf;
        }
    }

    open($xfdst,">$ofile") ||
        sysfault("$pgmtail: unable to open '%s' -- $!\n",$ofile);

    # output all the assembler text
    foreach $bf (@outlist) {
        # ordinary line
        unless (ref($bf)) {
            print($xfdst $bf,"\n");
            next;
        }

        # apply a fixup
        foreach $rhs (@$bf) {
            print($xfdst $rhs,"\n");
        }
    }

    # output all our reasoning as comments at the bottom
    foreach $bf (@cmtprt) {
        if ($bf eq "") {
            print($xfdst $cmtchr,$bf,"\n");
        }
        else {
            print($xfdst $cmtchr," ",$bf,"\n");
        }
    }

    close($xfdst);

    # get difference
    if (defined($opt_D)) {
        system("diff -u $file $ofile >> $opt_D");
    }

    # install fixed/modified file
    {
        last unless ($opt_o || defined($opt_O));
        last if ($fatal);
        msgprt("$pgmtail: installing ...\n");
        rename($ofile,$file);
    }
}

# regtmpall -- define all temporary register candidates
sub regtmpall
{

    dbgprt(1,"regtmpall: ENTER\n");

    regtmpdef("%r11","T");

    # NOTES:
    # (1) see notes on %r10 in ABI at bottom -- should we use it?
    # (2) a web search on "shared chain" and "x86" only produces 28 results
    # (3) some gcc code uses it as an ordinary register
    # (4) so, use it unless told not to
    regtmpdef("%r10","T")
        unless ($opt_n10);

    # argument registers (a6-a1)
    regtmpdef("%r9","A6");
    regtmpdef("%r8","A5");
    regtmpdef("%rcx","A4");
    regtmpdef("%rdx","A3");
    regtmpdef("%rsi","A2");
    regtmpdef("%rdi","A1");

    # callee preserved registers
    regtmpdef("%r15","P");
    regtmpdef("%r14","P");
    regtmpdef("%r13","P");
    regtmpdef("%r12","P");

    dbgprt(1,"regtmpall: EXIT\n");
}

# regtmpdef -- define usable temp registers
sub regtmpdef
{
    my($sym,$typ) = @_;

    dbgprt(1,"regtmpdef: SYM sym='%s' typ='%s'\n",$sym,$typ);

    push(@regtmplist,$sym);
    $regtmp_type{$sym} = $typ;
}

# regtmploc -- locate temp register to fix problem
sub regtmploc
{
    my($fnx,$fixlist) = @_;
    my($sixlist);
    my($uselook);
    my($regrhs);
    my($fixcnt);
    my($coretyp);
    my($reglhs,$regtyp);

    dbgprt(2,"regtmploc: ENTER fnx_fnc='%s'\n",$fnx->{fnx_fnc});

    $sixlist = $fnx->{fnx_sixlist};
    $fixcnt = @$fixlist;
    $fixcnt = 1
        if ($opt_P);

    $uselook = $fnx->{fnx_used};

    foreach $regrhs (@regtmplist) {
        dbgprt(2,"regtmploc: TRYREG regrhs='%s' uselook=%d\n",$regrhs,$uselook->{$regrhs});

        unless ($uselook->{$regrhs}) {
            $regtyp = $regtmp_type{$regrhs};
            $coretyp = $regtyp;
            $coretyp =~ s/\d+$//;

            # function uses stack arguments -- we can't push/pop
            if (($coretyp eq "P") && (@$sixlist > 0)) {
                dbgprt(2,"regtmploc: SIXREJ\n");
                next;
            }

            if (defined($opt_N)) {
                dbgprt(2,"regtmploc: TRYREJ opt_N='%s' regtyp='%s'\n",$opt_N,$regtyp);
                next if ($opt_N =~ /$coretyp/);
            }

            $reglhs = $regrhs;
            last;
        }
    }

    {
        last if (defined($reglhs));

        errprt("regtmploc: unable to locate usable fixup register for function '%s'\n",$fnx->{fnx_fnc});

        last if ($fixcnt <= 0);

        $fatal = 1;
    }

    dbgprt(2,"regtmploc: EXIT reglhs='%s' regtyp='%s'\n",$reglhs,$regtyp);

    ($reglhs,$regtyp);
}

# regusejoin -- get regex for all registers
sub regusejoin
{
    my($reg);

    dbgprt(1,"regusejoin: ENTER\n");

    # rax
    foreach $reg (qw(a b c d))  {
        regusedef($reg,"r_x","e_x","_x","_l","_h");
    }

    #   rdi/rsi
    foreach $reg (qw(d s)) {
        regusedef($reg,"r_i","e_i","_i","_il");
    }

    # rsp/rbp
    foreach $reg (qw(b s)) {
        regusedef($reg,"r_p","e_p");
    }

    foreach $reg (8,9,10,11,12,13,14,15) {
        regusedef($reg,"r_","r_d","r_w","r_b");
    }

    $regusergx = join("|",reverse(sort(@reguse)));

    dbgprt(1,"regusejoin: EXIT regusergx='%s'\n",$regusergx);
}

# regusedef -- define all registers
sub regusedef
{
    my(@argv) = @_;
    my($mid);
    my($pat);
    my($base);

    $mid = shift(@argv);

    dbgprt(1,"regusedef: ENTER mid='%s'\n",$mid);

    foreach $pat (@argv) {
        $pat = "%" . $pat;
        $pat =~ s/_/$mid/;
        $base //= $pat;
        dbgprt(1,"regusedef: PAT pat='%s' base='%s'\n",$pat,$base);

        push(@reguse,$pat);
        $reguse_tobase{$pat} = $base;
    }

    $reguse_isbase{$base} = 1;

    dbgprt(1,"regusedef: EXIT\n");
}

# regusesort -- sort base register names
sub regusesort
{
    my($symlhs,$numlhs);
    my($symrhs,$numrhs);
    my($cmpflg);

    {
        ($symlhs,$numlhs) = _regusesort($a);
        ($symrhs,$numrhs) = _regusesort($b);

        $cmpflg = $symlhs cmp $symrhs;
        last if ($cmpflg);

        $cmpflg = $numlhs <=> $numrhs;
    }

    $cmpflg;
}

# _regusesort -- split up base register name
sub _regusesort
{
    my($sym) = @_;
    my($num);

    if ($sym =~ s/(\d+)$//) {
        $num = $1;
        $num += 0;
        $sym =~ s/[^%]/z/g;
    }

    ($sym,$num);
}

# optget -- get options
sub optget
{
    my($argv) = @_;
    my($bf);
    my($sym,$val);
    my($dft,%dft);

    foreach $sym (qw(a l n10 P q o s T)) {
        $dft{$sym} = 1;
    }
    $dft{"N"} = "T";
    $dft{"D"} = "DIFF";

    while (1) {
        $bf = $argv->[0];
        $sym = $bf;

        last unless ($sym =~ s/^-//);
        last if ($sym eq "-");

        shift(@$argv);

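        {
            undef($val);

            # exact option name (e.g. "-n10") -- use its default value
            last if (defined($dft{$sym}));

            if ($sym =~ /([^=]+)=(.+)$/) {
                ($sym,$val) = ($1,$2);
                last;
            }

            if ($sym =~ /^(.)(.+)$/) {
                ($sym,$val) = ($1,$2);
                last;
            }
        }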

        $dft = $dft{$sym};
        sysfault("$pgmtail: unknown option -- '%s'\n",$bf)
            unless (defined($dft));

        $val //= $dft;

        ${"opt_" . $sym} = $val;
    }
}

# cmtprt -- transformation comments
sub cmtprt
{

    $_ = shift(@_);
    $_ = sprintf($_,@_);
    chomp($_);
    push(@cmtprt,$_);
}

# msgprt -- progress output
sub msgprt
{

    printf(STDERR @_);
}

# errprt -- show errors
sub errprt
{

    cmtprt(@_);
    printf(STDERR @_);
}

# sysfault -- abort on error
sub sysfault
{

    printf(STDERR @_);
    exit(1);
}

# dbgprt -- debug print
sub dbgprt
{

    $_ = shift(@_);
    goto &_dbgprt
        if ($opt_T >= $_);
}

# _dbgprt -- debug print
sub _dbgprt
{

    printf(STDERR @_);
}

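UPDATE:

I've updated the script to fix bugs, add more checks, and add more options. Note: I had to remove the ABI notes at the bottom to fit within the 30,000 character limit.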

Otherwise weird results appear on other commands with parentheses, for example cmpl %ebp,(%rax,%r14) splits into lhs='cmpl %ebp,(%rax' and rhs='%r14)', which in turn causes /$rhs/ to fail.

Yes, that was a bug. Fixed.

Your $rhs =~ /%[er](.x|\d+)/ doesn't match byte or word loads to di, or ax. That's unlikely, though. Oh, also, I think it fails to match rdi / rsi. So you don't need the trailing d in r10d.

Fixed. It now finds all variants.
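To illustrate what "all variants" means here, the following C sketch maps any sub-register name to its 64-bit base and then applies the same kind of test the script uses to flag a load: the destination's base register appears inside the memory source operand. The function names `reg_base` and `needs_fixup` are hypothetical, and the sketch is a simplified stand-in for the script's `regusedef` table and regex matching, not a port of it.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Map an AT&T register name (e.g. "%eax", "%r10d", "%sil") to its 64-bit
 * base ("%rax", "%r10", "%rsi").  Returns 1 on success, 0 if unrecognized. */
int reg_base(const char *reg, char *out, size_t outsz)
{
    static const char *const legacy[] =
        {"ax", "bx", "cx", "dx", "si", "di", "sp", "bp"};
    const char *n;
    size_t len;

    if (reg[0] != '%')
        return 0;
    n = reg + 1;
    len = strlen(n);

    /* %r8..%r15 with an optional d/w/b suffix */
    if (n[0] == 'r' && isdigit((unsigned char)n[1])) {
        int num = 0;
        size_t i = 1;

        while (i < 4 && isdigit((unsigned char)n[i]))
            num = num * 10 + (n[i++] - '0');
        if (num < 8 || num > 15)
            return 0;
        if (n[i] != '\0' && (strchr("dwb", n[i]) == NULL || n[i + 1] != '\0'))
            return 0;
        snprintf(out, outsz, "%%r%d", num);
        return 1;
    }

    for (size_t k = 0; k < sizeof(legacy) / sizeof(legacy[0]); k++) {
        const char *b = legacy[k];
        int abcd = (b[1] == 'x');   /* ax..dx have %al/%ah style halves */
        int match =
            (len == 2 && strcmp(n, b) == 0) ||              /* %ax, %si   */
            (len == 3 && (n[0] == 'r' || n[0] == 'e') &&
             strcmp(n + 1, b) == 0) ||                      /* %rax, %eax */
            (abcd && len == 2 && n[0] == b[0] &&
             (n[1] == 'l' || n[1] == 'h')) ||               /* %al, %ah   */
            (!abcd && len == 3 && strncmp(n, b, 2) == 0 &&
             n[2] == 'l');                                  /* %sil, %dil */

        if (match) {
            snprintf(out, outsz, "%%r%s", b);
            return 1;
        }
    }
    return 0;
}

/* A load needs the fix-up when the base register of its destination also
 * appears inside the memory source operand, e.g. "0x18(%rdi)" -> "%rdi". */
int needs_fixup(const char *src, const char *dst)
{
    char base[8];

    if (strchr(src, '(') == NULL || !reg_base(dst, base, sizeof(base)))
        return 0;
    return strstr(src, base) != NULL;
}
```

Treating `%eax` as `%rax` matters because a 32-bit write zero-extends into the full register, so `mov (%rcx,%rax,8),%eax` destroys the index just as the 64-bit form does.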

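Wow, I assumed something like this would have to happen at compile time, and that doing it after the fact would be too messy.

Shameless plug: thanks for the "Wow!". Perl is great for messy jobs like this. I've written assembler "injection" scripts like this before, for example back in the day [before compilers supported it] to add profiling calls.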

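You could mark %r10 as another call-preserved register.

After a few web searches, I could only find about 84 matches for "static chain" x86. The only relevant one was the x86 ABI, and it offers no explanation beyond mentioning it in a footnote. Also, some gcc-generated code uses r10 without saving it as a callee-preserved register. So I've now made the program use r10 by default (with a command line option to disable that if needed).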

What happens if a function already uses all the registers?

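If that is really the case, then we're out of luck. The script detects and reports this, and suppresses the fixup if it can't find a spare register.

Before giving up, though, it will use a "callee must preserve" register by injecting a push as the first instruction of the function and a corresponding pop just before the ret instruction [of which there can be several]. This can be disabled with an option.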

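You can't just push/pop, because that steps on the red-zone

No, it doesn't. Here's why:

(1) Almost as a side note: the red zone is only useful in leaf functions. Otherwise, if fncA calls fncB, the mere act of making the call means fncA steps on its own red zone. See the compile options in the script's top comment block.

(2) More importantly, it comes down to how the push/pop are injected. The push occurs before any other instruction. The pop occurs after every other instruction [just before the ret].

The red zone is still there, intact. It is merely offset by -8 from where it would otherwise be. All red zone activity is preserved because those instructions use negative offsets from %rsp.

This is different from a push/pop done inside an inline asm block. The usual case is that the red zone code does (e.g.) mov $23,-4(%rsp). An inline asm block that comes after that and does a push/pop will step on it.

Some functions to show this: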

# function_original -- original function before pebsfixup
# RETURNS: 23
function_original:
    mov     $23,-4(%rsp)                # red zone code generated by compiler
    ...
    mov     -4(%rsp),%rax               # will still have $23
    ret

# function_pebsfixup -- pebsfixup modified
# RETURNS: 23
function_pebsfixup:
    push    %r12                        # pebsfixup injected

    mov     $23,-4(%rsp)                # red zone code generated by compiler
    ...
    mov     -4(%rsp),%rax               # will still have $23

    pop     %r12                        # pebsfixup injected
    ret

# function_inline -- function with inline asm block and red zone
# RETURNS: unknown value
function_inline:
    mov     $23,-4(%rsp)                # red zone code generated by compiler

    # inline asm block -- steps on red zone
    push    %rdx
    push    %rcx
    ...
    pop     %rcx
    pop     %rdx

    ...

    mov     -4(%rsp),%rax               # now -4(%rsp) no longer has $23

    ret

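Where the push/pop does get us into trouble is when the function takes more than six arguments (i.e. args 7+ are on the stack). These arguments are accessed with positive offsets from %rsp: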

mov     32(%rsp),%rax

With our "trick" push, that offset is now incorrect. The correct offset is 8 higher:

mov     40(%rsp),%rax

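The script detects this and complains. However, because this is a low-probability case, it doesn't yet adjust the positive offsets. It would probably take about five more lines of code to fix that. Punting for now.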
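As a rough idea of what that missing adjustment could look like, here is a C sketch that rewrites a single operand. It is not part of the script: `adjust_rsp_offset` and `PUSH_BYTES` are hypothetical names, only decimal displacements of the form matched by the script's own check are handled, and the sketch assumes the function never moves %rsp itself, so every positive offset really refers to a stack argument.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Extra stack space taken by the injected 64-bit push. */
#define PUSH_BYTES 8

/* If `operand` has the form "<decimal>(%rsp)", write the operand with its
 * offset raised by PUSH_BYTES into `out` and return 1.  Any other operand
 * (negative red-zone offsets, other base registers) is copied unchanged
 * and 0 is returned. */
int adjust_rsp_offset(const char *operand, char *out, size_t outsz)
{
    char *end;
    long off;

    if (isdigit((unsigned char)operand[0])) {
        off = strtol(operand, &end, 10);
        if (strcmp(end, "(%rsp)") == 0) {
            snprintf(out, outsz, "%ld(%%rsp)", off + PUSH_BYTES);
            return 1;
        }
    }
    snprintf(out, outsz, "%s", operand);
    return 0;
}
```

In a function that allocates its own frame with a sub on %rsp, offsets that fall inside that frame address locals and must not be changed, so a real fix would need to know the frame size before applying this rewrite.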
