Coverity & iComment

  • Finding bugs requires specifications
    • Can be obtained from automated tools
    • Can be inferred from source code, execution traces, comments

Finding errors without knowing the truth

Contradiction - cross-examine

  • Any contradiction is an error

Deviance - To infer correct behaviour

  • e.g. if the code does X 1000 times and does Y only once, the single Y is probably an error

Cross-checking program belief systems

Must beliefs

  • Inferred from code that directly implies what the programmer must believe
  • Any contradiction of a MUST belief is an error
x = *p / z; // MUST: p not null, z != 0

unlock(l); // MUST: l acquired
x++; // MUST: x not protected by l

May beliefs

  • Inferred from code patterns that suggest a belief but do not guarantee it
  • May be coincidental
scope1() {
    A();
    B();
}

scope2() {
    A();
    B();
}

scope3() {
    A();
    B();
}

// MAY: A() and B() must be paired
  • Check MAY beliefs as if they were MUST beliefs
  • Rank errors by belief confidence (how likely the belief is real, judged from how consistently existing code follows it)

Trivial consistency: NULL pointers

*p implies MUST belief that p is not null

A check (p == NULL) implies two MUST beliefs:

POST - p is null on the true path, not null on the false path

PRE - whether p is null was unknown before the check
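The consistency rules above can be sketched as a tiny belief tracker for one pointer along one path. This is a minimal sketch under stated assumptions: the `event` encoding and the function `first_contradiction` are hypothetical names for illustration, not the format of any real tool.

```c
#include <assert.h>
#include <stddef.h>

/* What the analysis currently believes about one pointer p. */
enum belief { UNKNOWN, NOT_NULL, IS_NULL };

/* Events on one straight-line path (hypothetical mini-IR). */
enum event {
    DEREF,        /* *p                                  */
    CHECK_TRUE,   /* (p == NULL) evaluated, true branch  */
    CHECK_FALSE   /* (p == NULL) evaluated, false branch */
};

/* Returns the index of the first event that contradicts an earlier
 * belief, or -1 if the path is consistent. */
int first_contradiction(const enum event *path, size_t n) {
    assert(path != NULL || n == 0);
    enum belief b = UNKNOWN;
    for (size_t i = 0; i < n; i++) {
        if (path[i] == DEREF) {
            if (b == IS_NULL)
                return (int)i;   /* check-then-use: deref on the null path */
            b = NOT_NULL;        /* MUST: p is not null from here on */
        } else {
            if (b != UNKNOWN)
                return (int)i;   /* PRE violated: nullness already believed known */
            b = (path[i] == CHECK_TRUE) ? IS_NULL : NOT_NULL;   /* POST */
        }
    }
    return -1;
}
```

A dereference followed by a check (use-then-check) and a dereference on the null branch (check-then-use) are both reported, without the checker knowing which of the two beliefs is the wrong one.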

Redundancy checking

  • Assume code is supposed to be useful
  • Useless actions (low-level redundancies) may signal high-level bugs

e.g. x = x, 1*y, x&x, x|x

  • Assignments that are never used in subsequent code
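The redundant patterns listed above could be recognised with a check like the following. This is a rough sketch only: the `binop` representation (operands as plain strings) and `is_redundant` are assumptions made for illustration, and a real checker would work on a proper syntax tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum op { OP_ASSIGN, OP_AND, OP_OR, OP_MUL };

/* Hypothetical flattened binary expression: operands are spelled as text. */
struct binop {
    enum op op;
    const char *lhs;
    const char *rhs;
};

/* True for x = x, x & x, x | x, and multiplication by the literal 1. */
bool is_redundant(const struct binop *e) {
    assert(e != NULL && e->lhs != NULL && e->rhs != NULL);
    switch (e->op) {
    case OP_ASSIGN:
    case OP_AND:
    case OP_OR:
        return strcmp(e->lhs, e->rhs) == 0;   /* same operand on both sides */
    case OP_MUL:
        return strcmp(e->lhs, "1") == 0 || strcmp(e->rhs, "1") == 0;
    }
    return false;
}
```

A hit is not a bug by itself; it is a hint that the programmer probably meant something else, such as a different variable on one side.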

Handling MAY beliefs

  • MUST beliefs only need a single contradiction
  • MAY beliefs need many examples to separate fact from coincidence

  • Assume MAY beliefs are MUST beliefs

  • Record every successful check with a "check" message

  • Record every unsuccessful check with an "error" message

  • Rank errors based on ratio of checks (n) to errors (err)

  • Beliefs where n is large and err is small are the most likely to be real, so their violations are the most likely to be errors
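The ranking step could look roughly like this. It is a sketch under assumptions: `belief_stats` and `rank_beliefs` are hypothetical names, `checks` is taken to be the total number of times the belief was checked (n), and the ordering is a plain error-rate comparison with ties broken by n. The notes only say to rank by the ratio, so a real tool may use a different statistic.

```c
#include <assert.h>
#include <stdlib.h>

struct belief_stats {
    const char *name;
    int checks;   /* n: how many times the belief was checked */
    int errors;   /* err: how many of those checks failed */
};

/* Lower err/n ranks first; equal rates are ordered by larger n first. */
static int by_confidence(const void *a, const void *b) {
    const struct belief_stats *x = a;
    const struct belief_stats *y = b;
    /* Compare x->errors/x->checks with y->errors/y->checks by
     * cross-multiplying, which avoids floating point. */
    long long lhs = (long long)x->errors * y->checks;
    long long rhs = (long long)y->errors * x->checks;
    if (lhs != rhs)
        return lhs < rhs ? -1 : 1;
    return (y->checks > x->checks) - (y->checks < x->checks);
}

/* Sorts beliefs in place so the most trustworthy ones come first. */
void rank_beliefs(struct belief_stats *v, size_t count) {
    assert(v != NULL);
    for (size_t i = 0; i < count; i++)
        assert(v[i].checks > 0 && v[i].errors >= 0 && v[i].errors <= v[i].checks);
    qsort(v, count, sizeof *v, by_confidence);
}
```

With this ordering, a belief checked 1000 times with one failure is reported ahead of one checked twice with one failure, which matches the intuition that the second is more likely a coincidence.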

Deriving deallocation routines

Infer free functions

  • If pointer p is not used after calling foo(p), it implies MAY belief that foo() is a free function

  • Conceptually, assume every function frees all of its arguments

    • Emit a "check" message at every call site
    • Emit an "error" message at every use of the pointer after the call
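The counting described above can be sketched as a tally over call sites. Everything here is an assumption for illustration: the `call_site` record, the `tally_free_belief` function, and the simplification that each call site contributes at most one "error" (the notes emit one per use).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One call foo(p) plus whether p is used again afterwards (hypothetical record). */
struct call_site {
    const char *callee;
    bool arg_used_after;
};

struct tally {
    int checks;   /* call sites seen */
    int errors;   /* call sites where the argument is used afterwards */
};

/* Treats `callee` as if it freed its argument and counts the evidence. */
struct tally tally_free_belief(const struct call_site *sites, size_t n,
                               const char *callee) {
    assert(sites != NULL && callee != NULL);
    struct tally t = { 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (strcmp(sites[i].callee, callee) != 0)
            continue;
        t.checks++;                  /* "check" at every call site */
        if (sites[i].arg_used_after)
            t.errors++;              /* "error": pointer used after the call */
    }
    return t;
}
```

A function whose argument is almost never used after the call ends up with many checks and few errors, so it ranks as a likely free function and its few "errors" become candidate use-after-free bugs.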

Deriving routines that can fail

  • Rank errors based on number of checks to non-checks

  • Assume all functions can return NULL

  • If pointer checked before use, emit "check" message
  • If pointer used before check, emit "error"
  • Sort errors based on ratio of checks to errors
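As an illustration of what such a checker would see, consider the following code under analysis. The routine `lookup` and its three callers are hypothetical, invented only for this example.

```c
#include <stddef.h>

static int table[4] = { 10, 20, 30, 40 };

/* Hypothetical routine that can fail: returns NULL for an unknown key. */
int *lookup(int key) {
    return (key >= 0 && key < 4) ? &table[key] : NULL;
}

int caller_a(int key) {
    int *p = lookup(key);
    if (p == NULL)            /* checked before use: "check" */
        return -1;
    return *p;
}

int caller_b(int key) {
    int *p = lookup(key);
    return p ? *p + 1 : -1;   /* checked before use: "check" */
}

int caller_c(int key) {
    int *p = lookup(key);
    return *p;                /* used before any check: "error" */
}
```

Two checks against one error suggest that `lookup` can return NULL, so `caller_c` would be reported. With only three call sites the evidence is weak; in a large code base the same ratio over hundreds of callers would rank much higher.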

Deriving "A() must be followed by B()"

a(); ... b(); implies a MAY belief that a() must be followed by b()

  • May be a coincidence

  • Assume every a-b is a valid pair

  • Emit "check" for each path that has a() then b()
  • Emit "error" for each path that has a() and no b()
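The per-path classification can be sketched as below. This is a sketch with assumed names: a path is modelled as an array of called function names, and `classify_path` and the `verdict` values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum verdict {
    PATH_NONE,    /* a() never called: the path says nothing about the pair */
    PATH_CHECK,   /* a() later followed by b() */
    PATH_ERROR    /* a() called, but no b() after it */
};

/* Classifies one path (a sequence of call names) for the candidate pair (a, b). */
enum verdict classify_path(const char *const *calls, size_t n,
                           const char *a, const char *b) {
    assert(a != NULL && b != NULL && (calls != NULL || n == 0));
    int seen_a = 0;
    for (size_t i = 0; i < n; i++) {
        if (!seen_a) {
            if (strcmp(calls[i], a) == 0)
                seen_a = 1;          /* start looking for the matching b() */
        } else if (strcmp(calls[i], b) == 0) {
            return PATH_CHECK;
        }
    }
    return seen_a ? PATH_ERROR : PATH_NONE;
}
```

Summing `PATH_CHECK` and `PATH_ERROR` results over all paths gives the check and error counts for the pair, which can then be ranked like any other MAY belief.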

Comments for Reliability

  • Comments contain some specifications/rules
    • Calling context
    • Calling order
    • Unit
    • Help ensure correct software evolution
  • It is feasible to automatically extract specs from comments to detect comment-code mismatches
    • Program analysis
    • NLP
    • Machine learning
    • Statistics
  • Use comment-code redundancy to detect comment-code mismatches
    • A mismatch could indicate
      • Bugs
      • Bad comments
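A calling-context rule stated only in a comment, and a caller that violates it, might look like the following. The `queue` type and all function names are hypothetical and exist only to illustrate a comment-code mismatch.

```c
#include <assert.h>
#include <stdbool.h>

struct queue {
    bool locked;
    int len;
};

void q_lock(struct queue *q)   { assert(!q->locked); q->locked = true; }
void q_unlock(struct queue *q) { assert(q->locked);  q->locked = false; }

/* The caller must hold q's lock before calling this function. */
void q_push(struct queue *q) {
    q->len++;
}

/* Consistent with the comment: lock held around the call. */
void producer_ok(struct queue *q) {
    q_lock(q);
    q_push(q);
    q_unlock(q);
}

/* Comment-code mismatch: q_push() is called without the lock. */
void producer_mismatch(struct queue *q) {
    q_push(q);
}
```

A tool that extracts the rule "lock must be held before `q_push`" from the comment would flag `producer_mismatch`. Whether the code is buggy or the comment is out of date still has to be decided by a person.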

NLP

  • Analyze sentence structures

    • POS tagging
    • Chunking
    • Semantic Role Labeling
  • It is impossible to automatically analyze arbitrary comments
