Bug 12319 - The x86 backend can introduce branches that depend on uninitialized values
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86
Version: trunk
Hardware: PC Linux
Importance: P enhancement
Assignee: Unassigned LLVM Bugs
Reported: 2012-03-20 19:34 PDT by Rafael Ávila de Espíndola
Modified: 2020-06-14 09:35 PDT

Description Rafael Ávila de Espíndola 2012-03-20 19:34:32 PDT
At Mozilla it is common to see code that looks like:

void f(void);
void g(void);
_Bool h(int *a, int *b);
void test() {
  int a, b;
  if (!h(&a, &b) || a == 42 || b == 33)
    f();
  else
    g();
}

The idea is that the function h must initialize a and b *whenever it returns true*. Clang plus the IL optimizations produce:

--------------------------
define void @test() nounwind uwtable {
entry:
  %a = alloca i32, align 4
  %b = alloca i32, align 4
  %call = call zeroext i1 @h(i32* %a, i32* %b) nounwind
  %call.not = xor i1 %call, true
  %0 = load i32* %a, align 4, !tbaa !0
  %cmp = icmp eq i32 %0, 42
  %or.cond = or i1 %cmp, %call.not
  %1 = load i32* %b, align 4, !tbaa !0
  %cmp2 = icmp eq i32 %1, 33
  %or.cond3 = or i1 %or.cond, %cmp2
  br i1 %or.cond3, label %if.then, label %if.else
if.then:
  call void @f() nounwind
  br label %if.end
if.else:
  call void @g() nounwind
  br label %if.end
if.end:
  ret void
}
------------------------------

This is fine, since the branch never depends on an uninitialized value unless there is a bug in h. The problem is that the x86 backend splits the basic block:

----------------------------------------------
	callq	h
	cmpl	$42, 4(%rsp)
	je	.LBB0_3
	xorb	$1, %al
	testb	%al, %al
	jne	.LBB0_3
        ...
---------------------------------------------

Now the first branch is using a potentially uninitialized value. The full function behavior stays the same: if the value is uninitialized then h returned false and we will end up branching to .LBB0_3 anyway.
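
The equivalence claimed above can be checked exhaustively. The following is a minimal sketch in C, not the backend's actual logic: the three compare results are modeled as plain booleans, all function names are hypothetical, and the final test of b in the split form is an assumption, since the listing elides the rest of the block.

```c
#include <assert.h>
#include <stdbool.h>

/* Source form: short-circuit evaluation never looks at a or b when h
   returned false. */
bool then_source(bool h_ret, bool a_is_42, bool b_is_33) {
  return !h_ret || a_is_42 || b_is_33;
}

/* IR form: both compares are computed eagerly and combined with plain
   "or", mirroring %or.cond and %or.cond3. */
bool then_ir(bool h_ret, bool a_is_42, bool b_is_33) {
  bool or_cond = a_is_42 | !h_ret;
  return or_cond | b_is_33;
}

/* Split form: branch on the compare first, then on the call result.
   The final test of b is an assumption; the report elides that code. */
bool then_split(bool h_ret, bool a_is_42, bool b_is_33) {
  if (a_is_42) return true; /* cmpl $42, 4(%rsp) ; je .LBB0_3 */
  if (!h_ret) return true;  /* xorb $1, %al ; testb ; jne .LBB0_3 */
  return b_is_33;
}

/* Returns true if all three forms agree on every input combination. */
bool forms_agree(void) {
  for (int bits = 0; bits < 8; ++bits) {
    bool h = bits & 1, a = bits & 2, b = bits & 4;
    bool want = then_source(h, a, b);
    if (then_ir(h, a, b) != want || then_split(h, a, b) != want)
      return false;
  }
  return true;
}

/* Returns true if, whenever h fails, the outcome ignores both compares,
   i.e. whatever garbage a and b hold cannot change the result. */
bool garbage_is_harmless(void) {
  for (int bits = 0; bits < 4; ++bits) {
    bool a = bits & 1, b = bits & 2;
    if (!then_split(false, a, b) || !then_ir(false, a, b))
      return false;
  }
  return true;
}

/* Convenience wrapper: aborts if either property fails. */
void check_forms(void) {
  assert(forms_agree());
  assert(garbage_is_harmless());
}
```

Calling check_forms() from a test driver passes silently. The sketch only covers observable behavior; it says nothing about which intermediate branches a tool watching the instruction stream will see.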

The problem is that Valgrind reports an invalid use as soon as it sees the first branch.

Any ideas on how to fix this? Changing the optimization to check expressions that don't access memory first would fix this example, but any such splitting can introduce an uninitialized use in some other example.
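
To make the effect of branch order concrete, here is a rough sketch in C of a checker that flags any conditional branch whose condition is not fully defined. It is only loosely in the spirit of what Valgrind does and is not its implementation; struct tracked, branch_on and the other names are hypothetical, and only the first two conditions from the listing are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tracked value: a boolean plus a flag saying whether it
   was computed only from initialized data. */
struct tracked {
  bool value;
  bool defined;
};

/* Model of a conditional branch: count one report if the condition is
   undefined, then return the direction taken. */
bool branch_on(struct tracked cond, int *reports) {
  assert(reports);
  if (!cond.defined)
    ++*reports;
  return cond.value;
}

/* Order from the listing: the compare on loaded memory comes first. */
bool then_cmp_first(struct tracked not_h, struct tracked a_is_42,
                    int *reports) {
  if (branch_on(a_is_42, reports)) return true;
  return branch_on(not_h, reports);
}

/* Suggested order: the call result, which needs no memory access, is
   tested first. */
bool then_call_first(struct tracked not_h, struct tracked a_is_42,
                     int *reports) {
  if (branch_on(not_h, reports)) return true;
  return branch_on(a_is_42, reports);
}

/* Number of reports when h returned false and left a uninitialized;
   "garbage" is whatever the compare on the stale stack slot yields. */
int reports_when_h_fails(bool cmp_first, bool garbage) {
  struct tracked not_h = {true, true};       /* !h(...) is well defined */
  struct tracked a_is_42 = {garbage, false}; /* depends on uninit a */
  int reports = 0;
  if (cmp_first)
    then_cmp_first(not_h, a_is_42, &reports);
  else
    then_call_first(not_h, a_is_42, &reports);
  return reports;
}
```

In this model the compare-first order yields one report whatever the garbage value is, while the call-first order yields none. As noted above, that ordering only helps this particular example.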
Comment 1 Kostya Serebryany 2012-03-20 19:43:27 PDT
FYI: we have seen this and similar problems with Valgrind on gcc-compiled binaries as well. The best solution we could find was just to disable the guilty optimizations in gcc.
I doubt that LLVM or gcc should change the default behavior here, but it would be nice to have flags that disable Valgrind-unfriendly optimizations.
Comment 2 Kostya Serebryany 2012-03-20 19:46:06 PDT
FTR: this is one of the problems we had with gcc -O1: 

struct S {
  double unused;  // for alignment.
  int a, b;  // adjacent, a is 8 aligned
};

bool foo(S *s) {
  // compiler combines this into
  // *(int64_t*)(&s->a) == 0x45600000123
  return s->a == 0x123 && s->b == 0x456;
}

int main() {
  S s; // uninitialized
  s.a = 0;  // initialize only a, not b
  if (foo(&s))
    return 1;
  return 0;
}

objdump: 
00000000004005b4 <_Z3fooP1S>:
  4005b4:       48 b8 23 01 00 00 56    mov    $0x45600000123,%rax
  4005bb:       04 00 00 
  4005be:       48 39 47 08             cmp    %rax,0x8(%rdi)
  4005c2:       0f 94 c0                sete   %al
  4005c5:       c3                      retq
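
For illustration, the merged comparison can be modeled at the source level. This is a sketch in C under stated assumptions (32-bit int, b stored directly after a), not what gcc does internally; foo_split, foo_merged and the run_* helpers are hypothetical names. Unlike the report's main, every field is initialized here so the program has defined behavior. The expected 8-byte pattern is built with memcpy, so the sketch does not depend on byte order; on little-endian x86-64 that pattern is the 0x45600000123 constant seen in the objdump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct S {
  double unused; /* for alignment, as in the report */
  int a, b;      /* adjacent; a starts at an 8-byte-aligned offset */
};

/* Layout assumptions the merged compare relies on. */
_Static_assert(sizeof(int) == 4, "expects 32-bit int");
_Static_assert(offsetof(struct S, b) == offsetof(struct S, a) + sizeof(int),
               "expects b directly after a");

/* Field-by-field form, as written in the source: b is only read when
   the test on a succeeds. */
bool foo_split(const struct S *s) {
  return s->a == 0x123 && s->b == 0x456;
}

/* Merged form: one 8-byte load covering a and b, one compare. */
bool foo_merged(const struct S *s) {
  const int want[2] = {0x123, 0x456};
  uint64_t expected, actual;
  memcpy(&expected, want, sizeof expected);
  memcpy(&actual, (const unsigned char *)s + offsetof(struct S, a),
         sizeof actual);
  return actual == expected;
}

/* Helpers that build a fully initialized S and run each form. */
bool run_split(int a, int b) {
  struct S s = {0.0, a, b};
  return foo_split(&s);
}

bool run_merged(int a, int b) {
  struct S s = {0.0, a, b};
  return foo_merged(&s);
}

/* Convenience wrapper: aborts if the two forms ever disagree here. */
void check_merge(void) {
  assert(run_split(0x123, 0x456) == run_merged(0x123, 0x456));
  assert(run_split(0, 0x456) == run_merged(0, 0x456));
  assert(run_split(0x123, 0) == run_merged(0x123, 0));
}
```

Both forms return the same answer for initialized data. The difference is that foo_merged always reads the bytes of b, so when b is uninitialized the single compare depends on undefined bytes even in cases where foo_split would have stopped after the test on a.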