CS611: Program Analysis

“Happy to see that IIT Mandi has taken the initiative to float such an important and useful course.”

- Reviewers 

Offering: Feb-Jun 2021 (Hybrid)

Class strength: 21

Slack: cs611spring21.slack.com (use your @students.iitmandi.ac.in ID)

About the course

Program analysis approximates the runtime behavior of a program by looking at its source code. The results of program analyses are used to drive a plethora of applications: performance-oriented optimizations, program understanding, verification, debugging, refactoring, finding vulnerabilities, test generation, parallelization, and so on. As a result, once an exclusive part of the back-end of optimizing compilers, now program analysis finds its applications in the development of a large number of tools in a programming languageā€™s ecosystem. The aim of this course is threefold: (i) to develop a thorough understanding of the theory behind the discipline of analyzing programs; (ii) to explore modern strategies that balance the trade-offs between the precision of an analysis and the resources consumed therein; and (iii) to get introduced to various applications that benefit from the results of a precise-yet-efficient program analysis.

How shall we learn?

Being an advanced elective, the classes will often be discussion-based, with sometimes the instructor teaching but many times the students finding problems, suggesting ideas, critiquing the state-of-the-art, and so on. The hope is that everyone either comes up with something new, and/or gets excited to learn much more, in near future.

What shall we use?

In the programming assignments, we would be using Soot, one of the most popular industry-standard open-source tools for extracting information from as well as transforming Java Bytecode (thus programs written in any JVM-compatible language, such as Java, Scala and Clojure, can be analyzed using Soot).

Textbooks:

None! We would explore (and create?) knowledge that is yet to get old enough for textbooks.

References:

1. Flemming Nielson, Hanne Riis Nielson and Chris Hankin. “Principles of Program Analysis”, Corrected Edition, Springer, 1999.
2. Uday P. Khedker, Amitabha Sanyal and Bageshri Karkare. “Data Flow Analysis: Theory and Practice”, First Edition, CRC Press, 2009.
3. Anders Moeller and Michael I. Schwaryzbach. “Lecture Notes on Static Program Analysis”, PDF, 2020.
4. Y. N. Srikant and Priti Shankar. “The Compiler Design Handbook”, Second Edition, CRC Press, 2007.
5. Various research papers related to the course content.

Course structure

  • Credits: 3-1-0-4

  • Prerequisites: Any one of

    • A compilers course

    • CS202 (Data Structures and Algorithms) && CS208 (Mathematical Foundations of Computer Science) && CS304 (Formal Languages and Automata Theory)

    • Consent of teacher

  • Audience: Elective for CSE and EE students

  • Instructor: Manas Thakur

  • TA: Hrishikesh Tiwary (d19037)

  • Slot:

    • Registration: Free slot

    • Classes: G3 (Tue 10am; Thu 1pm; Fri 11am) + Occasional tutorials

Outline:

1. Introduction to static analysis
2. Heap analysis
3. Interprocedural analysis
4. Strategies for efficiency
5. Language features and challenges
6. Sneak peak into applications

Schedule

Content Remarks
Week 1 Feb 17, 19:
  • Introduction and logistics

  • Program representations

Getting started with Soot
Week 2 Feb 23, 25, 26, 27:
  • Abstract interpretation

  • Galois connection

  • Iterative data-flow analysis (IDFA)

Feb 23 HW: Posets and lattices
Tutorial on 27 at 12 pm [IDFA code]
Week 3 Mar 2, 4, 5:
  • SSA form

  • DU/UD chains

  • Redundancy elimination

PA1 released (due March 13)
Week 4 Mar 9, 12:
Flow-sensitive PTA
  • Heap abstractions

  • Points-to graphs

Week 5 Mar 16, 18, 19:
Flow-insensitive PTA
  • Andersen's analysis

  • Constraint-graph formulation

  • Steensgard's analysis

Week 6 Mar 23, 25, 26:
Interprocedural analysis
  • Exploded CFGs and interprocedurally invalid paths

  • Functional and call-string approaches

  • Value contexts

Extra tutorial on 26
Week 7 Mar 30; Apr 1:
  • Escape analysis

  • LSRV contexts

Mid sem on April 3
Week 8 Apr 6, 8, 9:
  • Call-graph construction

  • Object sensitivity

  • Type sensitivity

PA2 released (due April 18)
Week 9 Apr 13, 15, 16:
  • Mid-sem and PA2 discussion

  • Effects of heap cloning

  • Strong and weak updates

Week 10 Apr 20, 23:
  • Slicing

  • Program dependence graphs

PA3 released (due May 1)
Week 11 Apr 27, 29, 30:
  • Just-in-time compilation

  • JIT optimizations

  • The PYE framework

Week 12 May 4, 6, 7: BDD Slides
Classes suspended (Covid curfew) from May 6
Week 13 May 15:
  • Revision and discussion

Mini projects allotted on May 14 (due Jun 9)
Class on May 15 at 11 am
Week 14 May 18, 20, 21, 22:
  • Quirky language features

  • Analyzing parallel programs

  • May-Happen-in-Parallel (MHP) analysis

  • Discussion session

Tuesday schedule on May 22
Week 15 May 27, 28, 29:
  • Project presentations * 3

Friday schedule on May 25
Class on May 29 at 2:30 pm
Week 16 Jun 1, 3, 4:
  • Project presentations * 2

  • Software refactoring

  • Incremental analysis

  • Parallelizing program analyses

Last day of teaching on Jun 4

Evaluation

  • Mid-sem exam: 25

  • End-sem exam: 25

  • Programming assignments: 25

  • Paper reading or mini project: 20

  • Discussion: 5