Renderd7-nightly  v0.7.0
nlohmann::detail::lexer< BasicJsonType, InputAdapterType > Class Template Reference

lexical analysis

#include <json.hpp>

Inherits nlohmann::detail::lexer_base< BasicJsonType >.

Public Member Functions

constexpr number_integer_t get_number_integer () const noexcept
 return integer value
 
constexpr number_unsigned_t get_number_unsigned () const noexcept
 return unsigned integer value
 
constexpr number_float_t get_number_float () const noexcept
 return floating-point value
 
string_t & get_string ()
 return current string value (implicitly resets the token; useful only once)
 
constexpr position_t get_position () const noexcept
 return position of last read token
 
std::string get_token_string () const
 
constexpr const JSON_HEDLEY_RETURNS_NON_NULL char * get_error_message () const noexcept
 return syntax error message
 
bool skip_bom ()
 skip the UTF-8 byte order mark
 

Private Member Functions

int get_codepoint ()
 get codepoint from 4 hex characters following \u
 
bool next_byte_in_range (std::initializer_list< char_int_type > ranges)
 check if the next byte(s) are inside a given range
 
token_type scan_string ()
 scan a string literal
 
bool scan_comment ()
 scan a comment
 
token_type scan_number ()
 scan a number literal
 
token_type scan_literal (const char_type *literal_text, const std::size_t length, token_type return_type)
 
void reset () noexcept
 reset token_buffer; current character is beginning of token
 
void unget ()
 unget current character (read it again on next get)
 
void add (char_int_type c)
 add a character to token_buffer
 

Static Private Member Functions

static JSON_HEDLEY_PURE char get_decimal_point () noexcept
 return the locale-dependent decimal point
 

Private Attributes

InputAdapterType ia
 input adapter
 
const bool ignore_comments = false
 whether comments should be ignored (true) or signaled as errors (false)
 
char_int_type current = std::char_traits<char_type>::eof()
 the current character
 
bool next_unget = false
 whether the next get() call should just return current
 
position_t position {}
 the start position of the current token
 
std::vector< char_type > token_string {}
 raw input token string (for error messages)
 
string_t token_buffer {}
 buffer for variable-length tokens (numbers, strings)
 
const char * error_message = ""
 a description of occurred lexer errors
 
const char_int_type decimal_point_char = '.'
 the decimal point
 

Detailed Description

template<typename BasicJsonType, typename InputAdapterType>
class nlohmann::detail::lexer< BasicJsonType, InputAdapterType >

lexical analysis

This class organizes the lexical analysis during JSON deserialization.

Member Function Documentation

◆ get_codepoint()

template<typename BasicJsonType , typename InputAdapterType >
int nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::get_codepoint ( )
inline private

get codepoint from 4 hex characters following \u

For input "\u c1 c2 c3 c4" the codepoint is: (c1 * 0x1000) + (c2 * 0x0100) + (c3 * 0x0010) + c4 = (c1 << 12) + (c2 << 8) + (c3 << 4) + (c4 << 0)

Furthermore, the possible characters '0'..'9', 'A'..'F', and 'a'..'f' must be converted to the integers 0x0..0x9, 0xA..0xF, 0xA..0xF, resp. The conversion is done by subtracting the offset (0x30, 0x37, and 0x57) between the ASCII value of the character and the desired integer value.

Returns
codepoint (0x0000..0xFFFF) or -1 in case of an error (e.g. EOF or non-hex character)
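
The conversion described above can be sketched as a free function. This is a minimal illustration, not the member function itself: the name hex4_to_codepoint and the std::string_view parameter are assumptions made for this example, whereas the real get_codepoint() reads its four characters from the input adapter. The offsets assume an ASCII-compatible character set.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of the library): converts exactly four hex
// digits, as they would follow "\u", into a codepoint. Returns -1 on error.
int hex4_to_codepoint(std::string_view digits)
{
    if (digits.size() != 4)
    {
        return -1; // too few or too many characters
    }

    int codepoint = 0;
    for (std::size_t i = 0; i < 4; ++i)
    {
        const char c = digits[i];
        int value = 0;
        if (c >= '0' && c <= '9')
        {
            value = c - 0x30; // '0'..'9' -> 0x0..0x9
        }
        else if (c >= 'A' && c <= 'F')
        {
            value = c - 0x37; // 'A'..'F' -> 0xA..0xF
        }
        else if (c >= 'a' && c <= 'f')
        {
            value = c - 0x57; // 'a'..'f' -> 0xA..0xF
        }
        else
        {
            return -1; // non-hex character
        }
        // Shift amounts for c1..c4 are 12, 8, 4, and 0.
        codepoint += value << (12 - 4 * i);
    }

    assert(0x0000 <= codepoint && codepoint <= 0xFFFF);
    return codepoint;
}
```

For example, hex4_to_codepoint("00e9") yields 0xE9, and hex4_to_codepoint("12G4") yields -1.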

◆ next_byte_in_range()

template<typename BasicJsonType , typename InputAdapterType >
bool nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::next_byte_in_range ( std::initializer_list< char_int_type >  ranges)
inline private

check if the next byte(s) are inside a given range

Adds the current byte and, for each passed range, reads a new byte and checks if it is inside the range. If a violation was detected, set up an error message and return false. Otherwise, return true.

Parameters
[in]  ranges  list of integers; interpreted as a list of pairs of inclusive lower and upper bounds, respectively
Precondition
The passed list ranges must have 2, 4, or 6 elements; that is, 1, 2, or 3 pairs. This precondition is enforced by an assertion.
Returns
true if and only if no range violation was detected
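
The pairwise range check can be sketched on a plain byte buffer. This is an illustration under stated assumptions: the function name bytes_in_ranges, the std::string_view input, and the pos index are hypothetical. The real member function reads its bytes from the input adapter, appends them to token_buffer, and sets error_message on failure; none of that is modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string_view>

// Hypothetical stand-alone variant: checks the bytes of `input`, starting at
// index `pos`, against consecutive [lower, upper] pairs from `ranges`.
bool bytes_in_ranges(std::string_view input, std::size_t pos,
                     std::initializer_list<int> ranges)
{
    // Same precondition as documented above: 1, 2, or 3 pairs.
    assert(ranges.size() == 2 || ranges.size() == 4 || ranges.size() == 6);

    for (auto it = ranges.begin(); it != ranges.end(); it += 2)
    {
        if (pos >= input.size())
        {
            return false; // input ended before all ranges were checked
        }
        const int byte = static_cast<unsigned char>(input[pos++]);
        const int lower = *it;
        const int upper = *(it + 1);
        if (byte < lower || byte > upper)
        {
            return false; // range violation
        }
    }
    return true;
}
```

A typical use is validating UTF-8 continuation bytes: after the lead byte 0xE2 of a three-byte sequence, the call bytes_in_ranges(input, 1, {0x80, 0xBF, 0x80, 0xBF}) checks the two bytes that follow.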

◆ scan_string()

template<typename BasicJsonType , typename InputAdapterType >
token_type nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::scan_string ( )
inline private

scan a string literal

This function scans a string according to Sect. 7 of RFC 7159. While scanning, escape sequences are decoded and the resulting bytes are copied into the buffer token_buffer. When the function returns successfully, token_buffer is not null-terminated (as it may contain \0 bytes), and token_buffer.size() is the number of bytes in the string.

Returns
token_type::value_string if string could be successfully scanned, token_type::parse_error otherwise
Note
In case of errors, variable error_message contains a textual description.

◆ scan_comment()

template<typename BasicJsonType , typename InputAdapterType >
bool nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::scan_comment ( )
inline private

scan a comment

Returns
whether comment could be scanned successfully

◆ scan_number()

template<typename BasicJsonType , typename InputAdapterType >
token_type nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::scan_number ( )
inline private

scan a number literal

This function scans a number literal according to Sect. 6 of RFC 7159.

The function is realized with a deterministic finite state machine derived from the grammar described in RFC 7159. Starting in state "init", the input is read and used to determine the next state. Only state "done" accepts the number. State "error" is a trap state to model errors. In the table below, "anything" means any character but the ones listed before.

state    | 0        | 1-9      | e E      | +       | -       | .        | anything
---------|----------|----------|----------|---------|---------|----------|---------
init     | zero     | any1     | [error]  | [error] | minus   | [error]  | [error]
minus    | zero     | any1     | [error]  | [error] | [error] | [error]  | [error]
zero     | done     | done     | exponent | done    | done    | decimal1 | done
any1     | any1     | any1     | exponent | done    | done    | decimal1 | done
decimal1 | decimal2 | decimal2 | [error]  | [error] | [error] | [error]  | [error]
decimal2 | decimal2 | decimal2 | exponent | done    | done    | done     | done
exponent | any2     | any2     | [error]  | sign    | sign    | [error]  | [error]
sign     | any2     | any2     | [error]  | [error] | [error] | [error]  | [error]
any2     | any2     | any2     | done     | done    | done    | done     | done

The state machine is realized with one label per state (prefixed with "scan_number_") and goto statements between them. The state machine contains cycles, but any cycle can be left when EOF is read. Therefore, the function is guaranteed to terminate.

During scanning, the read bytes are stored in token_buffer. This string is then converted to a signed integer, an unsigned integer, or a floating-point number.

Returns
token_type::value_unsigned, token_type::value_integer, or token_type::value_float if number could be successfully scanned, token_type::parse_error otherwise
Note
The scanner is independent of the current locale. Internally, the locale's decimal point is used instead of '.' to work with the locale-dependent converters.
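
The transition table of scan_number() can be sketched as a small stand-alone state machine. Note the deliberate differences: this sketch uses an enum and a switch rather than labels and goto, it only reports how many characters form the number, and it performs no conversion to an integer or floating-point value. The names num_state, step, and scan_number_length are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

enum class num_state
{
    init, minus, zero, any1, decimal1, decimal2, exponent, sign, any2,
    done, error
};

// One transition of the table. End of input is passed as '\0' and is
// handled like any other character in the "anything" column.
num_state step(num_state s, char c)
{
    // "done" and "error" are terminal; they have no outgoing transitions.
    assert(s != num_state::done && s != num_state::error);

    const bool digit = (c >= '0' && c <= '9');
    const bool exp_char = (c == 'e' || c == 'E');

    switch (s)
    {
        case num_state::init:
            if (c == '0') { return num_state::zero; }
            if (digit) { return num_state::any1; }
            if (c == '-') { return num_state::minus; }
            return num_state::error;
        case num_state::minus:
            if (c == '0') { return num_state::zero; }
            if (digit) { return num_state::any1; }
            return num_state::error;
        case num_state::zero:
            if (exp_char) { return num_state::exponent; }
            if (c == '.') { return num_state::decimal1; }
            return num_state::done;
        case num_state::any1:
            if (digit) { return num_state::any1; }
            if (exp_char) { return num_state::exponent; }
            if (c == '.') { return num_state::decimal1; }
            return num_state::done;
        case num_state::decimal1:
            return digit ? num_state::decimal2 : num_state::error;
        case num_state::decimal2:
            if (digit) { return num_state::decimal2; }
            if (exp_char) { return num_state::exponent; }
            return num_state::done;
        case num_state::exponent:
            if (digit) { return num_state::any2; }
            if (c == '+' || c == '-') { return num_state::sign; }
            return num_state::error;
        case num_state::sign:
            return digit ? num_state::any2 : num_state::error;
        case num_state::any2:
            return digit ? num_state::any2 : num_state::done;
        default:
            return num_state::error;
    }
}

// Returns the length of the number literal at the start of `text`, or -1 if
// the machine reaches "error". The character that triggers "done" is not
// counted, mirroring the unget of the first character after the number.
int scan_number_length(std::string_view text)
{
    num_state s = num_state::init;
    std::size_t i = 0;
    while (true)
    {
        const char c = (i < text.size()) ? text[i] : '\0';
        s = step(s, c);
        if (s == num_state::done) { return static_cast<int>(i); }
        if (s == num_state::error) { return -1; }
        ++i;
    }
}
```

With this sketch, scan_number_length("-0.5e+10") returns 8, scan_number_length("1.") returns -1 because "decimal1" requires a digit, and scan_number_length("01") returns 1 because state "zero" accepts as soon as another digit follows. The loop terminates for the reason given above: once the input is exhausted, every state moves to "done" or "error".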

◆ scan_literal()

template<typename BasicJsonType , typename InputAdapterType >
token_type nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::scan_literal ( const char_type *  literal_text,
const std::size_t  length,
token_type  return_type 
)
inline private
Parameters
[in]  literal_text  the literal text to expect
[in]  length        the length of the passed literal text
[in]  return_type   the token type to return on success

◆ unget()

template<typename BasicJsonType , typename InputAdapterType >
void nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::unget ( )
inline private

unget current character (read it again on next get)

We implement unget by setting variable next_unget to true. The input is not changed - we just simulate ungetting by modifying chars_read_total, chars_read_current_line, and token_string. The next call to get() will behave as if the unget character is read again.
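
The flag-based approach can be sketched with a minimal reader. This is an illustration only: the class peekable_reader is hypothetical, it reads from a std::string instead of an input adapter, and it keeps a single counter in place of chars_read_total, chars_read_current_line, and token_string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical minimal reader illustrating the next_unget flag.
class peekable_reader
{
  public:
    explicit peekable_reader(std::string input) : input_(std::move(input)) {}

    // Returns the next character, or -1 at end of input.
    int get()
    {
        ++chars_read_total_;
        if (next_unget_)
        {
            // Replay the stored character; the input itself is not touched.
            next_unget_ = false;
        }
        else
        {
            current_ = (pos_ < input_.size())
                       ? static_cast<unsigned char>(input_[pos_++])
                       : -1;
        }
        return current_;
    }

    // Marks the current character to be returned again by the next get().
    void unget()
    {
        // This sketch supports only one character of lookback.
        assert(!next_unget_ && chars_read_total_ > 0);
        next_unget_ = true;
        --chars_read_total_;
    }

    std::size_t chars_read_total() const { return chars_read_total_; }

  private:
    std::string input_;
    std::size_t pos_ = 0;
    std::size_t chars_read_total_ = 0;
    int current_ = -1;
    bool next_unget_ = false;
};
```

After get() followed by unget(), the counter is back at its previous value and the next get() returns the same character again.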

◆ get_token_string()

template<typename BasicJsonType , typename InputAdapterType >
std::string nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::get_token_string ( ) const
inline

return the last read token (for errors only). Will never contain EOF (an arbitrary value that is not a valid char value, often -1), because 255 may legitimately occur. May contain NUL, which should be escaped.

◆ skip_bom()

template<typename BasicJsonType , typename InputAdapterType >
bool nlohmann::detail::lexer< BasicJsonType, InputAdapterType >::skip_bom ( )
inline

skip the UTF-8 byte order mark

Returns
true iff there is no BOM or the correct BOM has been skipped
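
The return value can be illustrated with a stand-alone check on a byte buffer. The UTF-8 byte order mark is the byte sequence 0xEF 0xBB 0xBF. The function name skip_utf8_bom and the pos parameter are assumptions made for this example; the member function itself consumes bytes from the input adapter.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-alone variant: on success, `pos` is advanced past the
// BOM if one is present. Returns false for an incomplete or malformed BOM.
bool skip_utf8_bom(std::string_view input, std::size_t& pos)
{
    assert(pos <= input.size());

    if (pos == input.size() || static_cast<unsigned char>(input[pos]) != 0xEF)
    {
        return true; // no BOM: nothing to skip
    }

    // The first byte looks like a BOM, so the next two bytes must match.
    if (input.size() - pos >= 3
        && static_cast<unsigned char>(input[pos + 1]) == 0xBB
        && static_cast<unsigned char>(input[pos + 2]) == 0xBF)
    {
        pos += 3;
        return true;
    }
    return false;
}
```

Input that starts with a complete BOM or with no BOM at all is accepted; input that starts with 0xEF but does not continue with 0xBB 0xBF is rejected.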