Remember that the BASIC interpreter tokenizes the program, so a stored line takes less space than its raw text. Each BASIC keyword is collapsed down to a single byte.
For example, a REM statement like
100 REM A test
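should take up 10 bytes in memory once tokenized (3 header bytes, 1 token byte for REM, and 6 characters of text), even though the source text is 14 chars long. The format in memory is something like <0xE0 + length byte>, <line no high byte>, <line no low byte>, <line data>... Judging by the example, the length counts only the line data, and the spaces after the line number and after the keyword are not stored.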
So the above REM line would look like the following in memory:
0xE7, 0x00, 0x64, 0x87,'A',' ','t','e','s','t'
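As a rough sketch of that layout in C: the helper below packs one already-tokenized line into a buffer. The name pack_line, the two macros, and the 0x1F length cap are my own assumptions for illustration (the cap is only there so the header still fits in one byte). None of this is taken from basic.c, so check the real source for the actual limits and token values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_BASE 0xE0u /* header byte is 0xE0 + length, per the format above */
#define TOKEN_REM   0x87u /* REM token as it appears in the example bytes */

/* Hypothetical helper, not from basic.c: writes header, line number
 * (high byte first), then the tokenized line data.
 * Returns the number of bytes written, or 0 if the line does not fit. */
static size_t pack_line(uint16_t line_no, const uint8_t *data, size_t data_len,
                        uint8_t *out, size_t out_cap)
{
    assert(data != NULL && out != NULL);

    /* Assumed cap: keeps HEADER_BASE + data_len within a single byte. */
    if (data_len > 0x1Fu || out_cap < data_len + 3u)
        return 0;

    out[0] = (uint8_t)(HEADER_BASE + data_len); /* 0xE0 + length of line data */
    out[1] = (uint8_t)(line_no >> 8);           /* line number, high byte */
    out[2] = (uint8_t)(line_no & 0xFFu);        /* line number, low byte */
    memcpy(out + 3, data, data_len);            /* token byte(s) and text */
    return data_len + 3u;
}
```

Calling pack_line(100, body, 7, buf, sizeof buf) with body holding TOKEN_REM followed by the six characters of "A test" gives back the same 10 bytes as listed above.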
It's all in the basic.c file.