Things behind `sys.settrace()`
As I was digging into CPython’s internals in order to implement a new feature of a virtual machine made recently, I guess that I probably solved a TODO task listed in the source code of byterun
to a certain degree.
This story begins with an interesting idea of using pdb
in bytefall
virtual machine…
(To avoid confusion and being affected by the changes among versions, here we take CPython 3.7 as the runtime)
When you call pdb.set_trace()
Typically, you need the following line to enter the debugging mode provided by pdb
:
1 | import pdb; pdb.set_trace() |
Once that line is executed, you are able to use all commands provide by pdb
or have a good time with REPL inside the interactive mode. However, in prior versions of bytefall
, internal execution flow of bytefall
virtual machine will be revealed when debugging mode is activated.
It’s such a tsunami for users who just want to check how their code work. Because they will not only be overwhelmed by too much information, but also get stucked in those frames with data related to their code indirectly.
So, our goal is to make user able to use pdb
like the way they are using in a normal Python runtime. To achieve this goal, we have to understand how pdb.set_trace()
work first. But before diving into source code, we need to figure out what we are interesting in, and it will help us focus on the problem we want to solve.
As we want to control the information exposure of internal execution, what we need to trace are most likely some operations related to frame
. Because frame
contains information of executing scope, and it also related to the call stack which we can get from traceback
.
It sounds like a reasonable guess, so let’s start to take a quick look at the implementation of pdb
.
-
Starting from the call
pdb.set_trace()
, we can speculate thatset_trace
is a function defined in the module scope ofpdb
. And here is what we got inpdb.py
:1
2
3
4
5
6
7
8
9# @pdb.py::set_trace
def set_trace(*, header=None):
pdb = Pdb()
if header is not None:
# Print `header` to stdout
pdb.message(header)
# The code we want to know
pdb.set_trace(sys._getframe().f_back)In those lines above, we know that there is an instance of
Pdb
class being instantiated and its methodset_trace()
is called.
Also,set_trace()
is the only operation related to frame here, so that it’s what we are interesting in. However, we cannot findset_trace()
inPdb
class, so it’s probably defined in its parentbdb.Bdb
. -
After navigating to the file
bdb.py
, we found it indeed:1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21# @bdb.py::class Bdb::set_trace
def set_trace(self, frame=None):
"""Start debugging from frame.
If frame is not specified, debugging starts from caller's frame.
"""
if frame is None:
frame = sys._getframe().f_back
self.reset()
# Install callback `trace_dispatch` to each frame, and set `botframe` up
while frame:
frame.f_trace = self.trace_dispatch
self.botframe = frame
frame = frame.f_back
# Set information for `step` command
self.set_step()
# The code we want to know
sys.settrace(self.trace_dispatch)According to the documentation of inspect and sys, we can know that
f_trace
is used to store a callback function for tracing code, and that callback function will be installed throughsys.settrace()
.And what will the callback function
trace_dispatch()
do?1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21def trace_dispatch(self, frame, event, arg):
# Details about this callback function is also documented in its
# docstring.
if self.quitting:
return # None
if event == 'line':
return self.dispatch_line(frame)
if event == 'call':
return self.dispatch_call(frame, arg)
if event == 'return':
return self.dispatch_return(frame, arg)
if event == 'exception':
return self.dispatch_exception(frame, arg)
if event == 'c_call':
return self.trace_dispatch
if event == 'c_exception':
return self.trace_dispatch
if event == 'c_return':
return self.trace_dispatch
print('bdb.Bdb.dispatch: unknown debugging event:', repr(event))
return self.trace_dispatchIt works like the way how it’s described in document. When it’s invoked with argument
event
of'line', 'call', 'return', 'exception'
, it will call those functions corresponding to each event. But it’s irrelevant to the control of frame stack. In other word, it’s like a subscriber who is responsible to do something according to givenframe
, not an issuer which is able to determine when to fire an event with specificframe
. Therefore, it seemstrace_dispatch()
is not a function able to control the information exposure of internal execution.But it’s fine, let’s keep tracing deeper into
sys.settrace()
. -
Since
sys
module is not written in Python, we have to access it from CPython source code. And here is howsys.settrace()
looks like:1
2
3
4
5
6
7
8
9
10
11
12
13
14
15// @cpython/Python/sysmodule.c::sys_settrace
// signature in Python:
// `settrace(function)`
static PyObject *
sys_settrace(PyObject *self, PyObject *args) {
/* omitted code */
if (args == Py_None)
PyEval_SetTrace(NULL, NULL);
else
// args: your Python callback function
PyEval_SetTrace(trace_trampoline, args);
/* omitted code */
}According to the signature of
sys.settrace()
, we can know thatargs
is the callback function passed from Python side. And onceargs
is notNone
,PyEval_SetTrace()
will be called with a function namedtrace_trampoline()
and the callback function we just passed in.Next, let’s see how
trace_trampoline()
is implemented.1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33// @cpython/Python/sysmodule.c::trace_trampoline
static int
trace_trampoline(PyObject *self, PyFrameObject *frame,
int what, PyObject *arg) {
/* omitted code */
// Determine the callback function
if (what == PyTrace_CALL)
callback = self;
else
callback = frame->f_trace;
/* omitted code */
result = call_trampoline(callback, frame, what, arg);
/* omitted code */
}
// @cpython/Python/sysmodule.c::call_trampoline
static PyObject *
call_trampoline(PyObject* callback,
PyFrameObject *frame, int what, PyObject *arg) {
/* omitted code */
stack[0] = (PyObject *)frame;
stack[1] = whatstrings[what];
stack[2] = (arg != NULL) ? arg : Py_None;
/* call the Python-level function */
result = _PyObject_FastCall(callback, stack, 3);
/* omitted code */
}Remember what the signature of
trace_dispatch()
looks like? The last 3 arguments are the same as those in signatures oftrace_trampoline()
andcall_trampoline()
.
So, based on these implementation and function signatures, we can speculate thattrace_trampoline()
would play a role of invoking callback function while it is triggered.But still, these 2 C-functions are not used to control frame stack. There is one remaining function for us to keep investigating:
PyEval_SetTrace()
. -
Prefix of
PyEval_SetTrace()
indicating that this function is related to the bytecode dispatching loop locating inceval.c
, and here is a sketch of it:1
2
3
4
5
6
7
8
9
10
11
12
13// @cpython/Python/ceval.c::PyEval_SetTrace
void
PyEval_SetTrace(Py_tracefunc func, PyObject *arg)
{
PyThreadState *tstate = PyThreadState_GET();
/* omitted code */
tstate->c_tracefunc = func;
tstate->c_traceobj = arg;
/* omitted code */
}We can see that there are 2 attributes of
tstate
is set, and they are also the input arguments of this function. To make it clear, let’s recall the call stack starting fromsys.settrace()
in Python with a custom callback function named inpy_callback
:1
2
3
4
5
6
7
8
9
10
11# --- In Python ---
sys.settrace(py_callback)
# --- In aspect of C ---
# ->
# signature: `sys_settrace(PyObject *self, PyObject *args)`
sys_settrace(..., py_callback)
# ->
# signature: `PyEval_SetTrace(Py_tracefunc func, PyObject *arg)`
PyEval_SetTrace(trace_trampoline, py_callback)Therefore, we can know that
func
andarg
are actually:1
2
3// in PyEval_SetTrace()
tstate->c_tracefunc = func; // trace_trampoline
tstate->c_traceobj = arg; // py_callbackAnd we can be sure that
pdb.set_trace()
is just an operation of setting up things for tracing code, instead of an operation related to the control of frame stack. But since we got some clues abouttstate->c_tracefunc
andtstate->c_traceobj
, we can still go further into the huge loop inside_PyEval_EvalFrameDefault()
.
Take a look at the huge evaluation loop
Before analyzing _PyEval_EvalFrameDefault()
, we can find out those functions which take either tstate->c_tracefunc
or tstate->c_traceobj
as its arguments. This could make us focus on those parts which we are interested in.
And here are the signatures of those functions we found:
1 | // @cpython/Python/ceval.c |
Bases on this, we can simplify the implementation of _PyEval_EvalFrameDefault()
to the following one:
1 | // @cpython/Python/ceval.c::_PyEval_EvalFrameDefault |
With the simplified code above, we can roughly understand the execution flow of a CPython interpreter. Besides, timing of each trace event documented in trace_dispatch()
is also found out now (those sections are marked at the code block above):
- ‘call’: section_02, PyTrace_CALL
- ‘line’: section_03, PyTrace_LINE
- ‘exception’: section_06, PyTrace_EXCEPTION
- ‘return’: section_08, PyTrace_RETURN
Therefore, answer of our guess mentioned in previous section is revealed:
- Frame stack is not controlled by a single function, it related to the recursive execution of
_PyEval_EvalFrameDefault()
. (further reading: cpython/Objects/call.c) - There is nothing like an event issuer, tracing events are issued after entering those execution blocks.
So, how would pdb
work with line tracing
There is an interesting function named maybe_call_line_trace()
, and you can see that it will be invoked whenever there is a new opcode
is going to be dispatched (see also here).
But we should note that there is a prefix maybe
, which indicates that function actually won’t invoke a trace function everytime when it is executed. It does make sense, because what is going to be dispatched here is bytecode, rather than source code we written. And one line of source code usually can be compiled to multiple lines of bytecode.
Now, another question comes up: “How does it knows when to invoke trace function for line tracing since maybe_call_line_trace()
locates inside a loop for dispatching bytecode?”
It’s not a hard problem to be answered after checking out how it is implemented:
1 | // @cpython/Python/ceval.c::maybe_call_line_trace |
Clearly, we can know that:
- There is a function
_PyCode_CheckLineNumber()
which is used to find the upper and lower bound of index of bytecode instruction (denoted asinstr_lb
andinstr_ub
; further reading: line number table). frame.f_lasti
is an index of the lastest executed bytecode instruction.- Just like the comment shown above: once we are just entering a new interval of bytecode instructions (
frame->f_lasti == *instr_lb
) or an operation of jumping backward has been executed (e.g. block offor
/while
loop), trace function will be invoked (of course, only when the flagframe->f_trace_lines
is set). - If the flag
frame->f_trace_opcodes
is set, trace function will be invoked with an event argumentPyTrace_OPCODE
.
Besides, we got a useful information from it:
Line number of source code is evaluated only when it need to be used.
Though that is not a topic of this article, it shows itself as a part of the reason why a Python script runs slower under a line tracing mode or profiling mode.
In bytefall
…
Figuring out how the mechanism of line tracing works in CPython not only helps me to complete the part of bytecode dispatch in bytefall
VM, but also make it possible to run pdb
in bytefall
VM like the way it is used in a normal Python runtime.
But it’s just a part of implementation of this feature, we still need to make some modifications to make it work properly. If you are interested in it, you can also check out this file for the Pdb
wrapper I wrote.
Further reading
If you are interested in this topic, I recommend you read this nice article: “How C trace functions really work - by Ned Batchelder”.