Coverity & iComment

Finding bugs requires specifications
- Can be obtained from automated tools
- Can be inferred from source code, execution traces, comments

Finding errors without knowing the truth

Contradiction - cross-examine

Any contradiction is an error

Deviance - To infer correct behaviour

e.g. If the code does X 1000 times and does Y on only one occasion, the Y case is probably an error

Cross-checking program belief systems

Must beliefs

Inferred from programming decisions that imply the programmer must hold a belief
Must be an error if the belief is not satisfied

x = *p / z; // MUST: p not null, z != 0

unlock(l); // MUST: l acquired
x++; // MUST: x not protected by l

May beliefs

Inferred from programming decisions that imply the programmer may hold a belief
May be coincidental

scope1() {
    A();
    B();
}

scope2() {
    A();
    B();
}

scope3() {
    A();
    B();
}

// MAY: A() and B() must be paired

Check as MUST beliefs
Rank errors by belief confidence (the probability, inferred from existing code, that a reported error is a real one)

Trivial consistency: NULL pointers

*p implies MUST belief that p is not null

A check (p == NULL) implies two MUST beliefs:

POST - p is null on true path, not null on false path

PRE - p was unknown before check
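
A minimal sketch of the NULL consistency check, assuming each path has already been reduced to a list of events on a single pointer p. The event names, the belief states and count_contradictions() are hypothetical and only illustrate the idea; they are not taken from any real tool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events seen for one pointer p along one path. */
typedef enum {
    EV_DEREF,         /* *p */
    EV_CHECK_NULL,    /* took the true branch of (p == NULL) */
    EV_CHECK_NONNULL  /* took the false branch of (p == NULL) */
} ptr_event;

/* What the code currently "believes" about p. */
typedef enum { B_UNKNOWN, B_NULL, B_NOT_NULL } ptr_belief;

/* Count MUST-belief contradictions along one path. */
int count_contradictions(const ptr_event *path, size_t n)
{
    ptr_belief belief = B_UNKNOWN;
    int errors = 0;

    for (size_t i = 0; i < n; i++) {
        if (path[i] == EV_DEREF) {
            /* *p implies p is not null: contradicts a null belief. */
            if (belief == B_NULL)
                errors++;
            belief = B_NOT_NULL;
        } else {
            /* PRE: a check implies p was unknown before the check. */
            if (belief != B_UNKNOWN)
                errors++;
            /* POST: null on the true path, not null on the false path. */
            belief = (path[i] == EV_CHECK_NULL) ? B_NULL : B_NOT_NULL;
        }
    }
    return errors;
}
```

In this sketch, "dereference then check" and "check is true then dereference" each count as one contradiction; "check is false then dereference" counts as none.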

Redundancy checking

Assume code is supposed to be useful
Useless actions (low-level redundancies) may lead to high level bugs

e.g. x = x, 1*y, x&x, x|x

Assignments that are never used in subsequent code
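
A hypothetical illustration of how a low-level redundancy can point at a high-level bug. The struct and both functions are made up for this example; the self-assignment in copy_buggy() does nothing, which suggests the author meant src->len. Some compilers may warn about the self-assignment.

```c
#include <assert.h>

struct buf {
    int len;
    int cap;
};

/* Buggy: the first assignment is useless, so it is probably a typo. */
void copy_buggy(struct buf *dst, const struct buf *src)
{
    dst->len = dst->len;   /* redundant x = x: likely meant src->len */
    dst->cap = src->cap;
}

/* Intended behaviour: every assignment does useful work. */
void copy_fixed(struct buf *dst, const struct buf *src)
{
    dst->len = src->len;
    dst->cap = src->cap;
}
```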

Handling MAY beliefs

MUST beliefs only need a single contradiction
MAY beliefs need many examples to separate fact from coincidence
Assume MAY beliefs are MUST beliefs
Record every successful check with a "check" message
Record every unsuccessful check with an "error" message
Rank errors based on the ratio of checks (n) to errors (err)
Ones where n is large and err is small are most likely to be real errors
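
A minimal sketch of the ranking step, assuming the check and error counts have already been collected per belief. The belief_stat struct and rank_beliefs() are hypothetical names. The ordering here is a simplification: plain success ratio, with ties broken by the larger n; a real tool may use a proper statistical test instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-belief counters. */
typedef struct {
    const char *rule;  /* label for the MAY belief */
    long checks;       /* n: sites where the belief held */
    long errors;       /* err: sites where it was violated */
} belief_stat;

/* Order by checks / (checks + errors), highest first, without floats:
 * x ranks above y exactly when x.checks * y.errors > y.checks * x.errors. */
static int by_confidence(const void *a, const void *b)
{
    const belief_stat *x = a;
    const belief_stat *y = b;
    long lhs = x->checks * y->errors;
    long rhs = y->checks * x->errors;

    if (lhs != rhs)
        return lhs > rhs ? -1 : 1;
    if (x->checks != y->checks)
        return x->checks > y->checks ? -1 : 1;  /* more evidence first */
    return 0;
}

/* Sort beliefs so that the most trustworthy ones come first. */
void rank_beliefs(belief_stat *stats, size_t count)
{
    qsort(stats, count, sizeof *stats, by_confidence);
}
```

Errors reported against the beliefs at the front of the sorted array are the ones worth inspecting first.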

Deriving deallocation routines

Infer free functions

If pointer p is not used after calling foo(p), it implies MAY belief that foo() is a free function
Conceptually, assume all functions free all of their arguments
- Emit a "check" message at every call site
- Emit an "error" message at every later use of the argument
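
The counting above can be sketched as follows, assuming each path is reduced to events on one pointer p: either a use of p, or a call that passes p to the function with a given id. EV_USE, site_tally and tally_free_belief() are hypothetical names. As a simplification, this sketch counts at most one error per call site, and passing p to another function does not count as a use.

```c
#include <assert.h>
#include <stddef.h>

/* Any use of p. Non-negative values are ids of functions called with p. */
#define EV_USE (-1)

typedef struct {
    int checks;  /* call sites seen */
    int errors;  /* call sites after which p is still used */
} site_tally;

/* Update per-function tallies for one path of events on p. */
void tally_free_belief(const int *path, size_t n, site_tally *tallies)
{
    for (size_t i = 0; i < n; i++) {
        if (path[i] == EV_USE)
            continue;

        tallies[path[i]].checks++;          /* "check" at every call site */

        for (size_t j = i + 1; j < n; j++) {
            if (path[j] == EV_USE) {
                tallies[path[i]].errors++;  /* "error": p used after call */
                break;
            }
        }
    }
}
```

A function with many checks and almost no errors is a likely free function; one whose argument is routinely used afterwards is not.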

Deriving routines that can fail

Rank errors based on the ratio of checks to non-checks
Assume all functions can return NULL
If pointer checked before use, emit "check" message
If pointer used before check, emit "error"
Sort errors based on ratio of checks to errors
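
A hypothetical example of what the two messages correspond to in code. node_new() and both callers are made up for illustration. If most callers look like read_checked(), the MAY belief "node_new() can return NULL" gains confidence, and read_unchecked() is ranked as a likely bug.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
};

/* Hypothetical allocator: returns NULL when malloc fails. */
struct node *node_new(int value)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL)
        n->value = value;
    return n;
}

/* Pointer checked before use: the analysis emits "check". */
int read_checked(int v)
{
    struct node *n = node_new(v);
    int out;

    if (n == NULL)
        return -1;
    out = n->value;
    free(n);
    return out;
}

/* Pointer used before any check: the analysis emits "error". */
int read_unchecked(int v)
{
    struct node *n = node_new(v);
    int out = n->value;  /* crashes if node_new() returned NULL */

    free(n);
    return out;
}
```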

Deriving "A() must be followed by B()"

a(); ... b(); implies a MAY belief that a() must be followed by b()

May be a coincidence
Assume every a-b is a valid pair
Emit "check" for each path that has a() then b()
Emit "error" for each path that has a() and no b()
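
A minimal sketch of the per-path classification, assuming each path is encoded as a string with one character per call. The encoding and classify_path() are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <string.h>

/* Classify one path for the candidate pair (a, b).
 * Returns  1 for "check": a() occurs and b() occurs later,
 *          0 for "error": a() occurs with no later b(),
 *         -1 if a() never occurs, so the path says nothing about the pair. */
int classify_path(const char *path, char a, char b)
{
    const char *pa = strchr(path, a);

    if (pa == NULL)
        return -1;
    return strchr(pa + 1, b) != NULL ? 1 : 0;
}
```

Summing the 1s and 0s over all paths gives the check and error counts used to rank each candidate pair.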

Comments for Reliability

Some specifications/rules
- Calling context
- Calling order
- Unit
- Help ensure correct software evolution

It is feasible to automatically extract specs from comments to detect comment-code mismatches
- Program analysis
- NLP
- Machine learning
- Statistics
Use comment-code redundancy to detect comment-code mismatches
- A mismatch could indicate
  - Bugs
  - Bad comments
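
A hypothetical example of a comment-code mismatch. The comment on list_push() states a calling-context rule; add_unlocked() violates it. The integer flag is a toy stand-in for a real lock, and all names are made up for illustration. Either add_unlocked() is a bug, or the comment is out of date.

```c
#include <assert.h>

static int list_lock_held = 0;  /* toy stand-in for a real lock */
static int list_len = 0;

/* Caller must hold list_lock. */
static int list_push(void)
{
    list_len++;
    return list_lock_held;  /* 1 if the commented rule was respected */
}

/* Code agrees with the comment. */
int add_locked(void)
{
    int ok;

    list_lock_held = 1;
    ok = list_push();
    list_lock_held = 0;
    return ok;
}

/* Code contradicts the comment: list_push() is called without the lock. */
int add_unlocked(void)
{
    return list_push();
}
```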

NLP

Analyze sentence structures
- POS tagging
- Chunking
- Semantic Role Labeling
It is impossible to automatically analyze arbitrary comments
