YAML C Language Example

http://pyyaml.org/wiki/LibYAML

https://www.wpsoftware.net/andrew/pages/libyaml.html

Introduction

This tutorial is an introduction to using the libyaml library with the C programming language. It assumes a basic knowledge of the YAML format. See Wikipedia for more information and links to the relevant sites.

More Information

All information used to write this page came from the header yaml.h, and trial-and-error.

YAML

What is YAML?

YAML stands for "YAML Ain't Markup Language", and is a data-storage format designed for easy readability and machine parsing. It is more verbose than JSON, and less verbose than XML, two other formats designed for similar purposes.

It allows data to be nested arbitrarily deep, allows embedded newlines, and supports sequences, relational trees and all sorts of fancy stuff. I'll be shying away from the advanced features and sticking to the basics.

What Does it Look Like?

I am no YAML expert. I am learning the language as I write this — in fact, before this morning I had never used YAML, though I'd heard of it. Arguably, I am the last person who should be writing this tutorial. But last I looked, there was no decent beginner's tutorial for libyaml, so I'm doing it.

Here is the file I will be parsing:

# config/public.yaml

title   : Finex 2011
img_url : /finex/html/img/
css_url : /finex/html/style/
js_url  : /finex/html/js/

template_dir: html/templ/

default_act : idx    # used for invalid/missing act=

pages:
  - act   : idx
    title : Welcome
    html  : public/welcome.phtml
  - act   : reg
    title : Register
    html  : public/register.phtml
  - act   : log
    title : Log in
    html  : public/login.phtml
  - act   : out
    title : Log out
    html  : public/logout.phtml

One non-obvious (from this file) thing about YAML is that it forbids the tab character in indentation. Why? Because YAML depends on indentation for its structure, and tabs tend to mess things up. Tabs are still allowed elsewhere, for example inside a value.
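Since a stray tab is an easy mistake to make in a hand-edited config file, here is a minimal sketch of a pre-check for it. The helper name indent_has_tab is my own invention and not part of libyaml; it only looks at the leading whitespace of a single line.

```c
#include <stddef.h>

/* Return 1 if the leading whitespace of `line` contains a tab, else 0.
   Hypothetical helper for illustration; libyaml reports such lines
   itself as scanner errors. */
static int indent_has_tab(const char *line)
{
  size_t i;

  for(i = 0; line[i] == ' ' || line[i] == '\t'; i++) {
    if(line[i] == '\t')
      return 1;
  }
  return 0;
}
```

A tab that appears after the first non-blank character (say, between a key and its value) is deliberately not flagged.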

This is part of my ongoing project to port Finex away from PHP. (Why C? I'm trying quite a few languages, and C happens to be today's.) As you can see, this is a simple file I'll be using to instruct my templating system on how to read the act=XXX portion of a URL.

libyaml

libyaml is a C library for parsing YAML files, and probably available from your package manager. To use it, include the file yaml.h and add the linker flag -lyaml to gcc.

For detailed information, check out the header file /usr/include/yaml.h.

yaml_parser_t

The primary object used by libyaml is the parser itself. This is an object of type yaml_parser_t. It must be allocated manually (usually on the stack) and is initialized/deinitialized by the functions:

int yaml_parser_initialize(yaml_parser_t *)
void yaml_parser_delete(yaml_parser_t *)

All error codes are returned as ints. 1 signifies success, 0 failure. Next, to open a specific file, we use the function:

void yaml_parser_set_input_file(yaml_parser_t *parser, FILE *file)

There are also functions to read input from a string or generic read handler, and to set the encoding of an input file. We won't cover those here, but be aware.

Our code thus far is:
#include <stdio.h>
#include <yaml.h>

int main(void)
{
  FILE *fh = fopen("config/public.yaml", "r");
  yaml_parser_t parser;

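  /* Check the file first: a NULL FILE* must not reach libyaml or fclose() */
  if(fh == NULL) {
    fputs("Failed to open file!\n", stderr);
    return 1;
  }

  /* Initialize parser */
  if(!yaml_parser_initialize(&parser)) {
    fputs("Failed to initialize parser!\n", stderr);
    fclose(fh);
    return 1;
  }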

  /* Set input file */
  yaml_parser_set_input_file(&parser, fh);

  /* CODE HERE */

  /* Cleanup */
  yaml_parser_delete(&parser);
  fclose(fh);
  return 0;
}

This should compile and run without error, though it doesn't do anything yet.

Token-Based and Event-Based Parsing

There are two ways to parse a YAML document using libyaml: token-based and event-based. The simplest way, conceptually, is token-based. By using the functions

int yaml_parser_scan(yaml_parser_t *parser, yaml_token_t *token)
void yaml_token_delete(yaml_token_t *token)

we can get each token from the YAML document in turn. The full yaml_token_t structure can be found in yaml.h, but for our purposes we will only need the two fields .type and .data.scalar.value, which tell us the token type, and its data (if the type is YAML_SCALAR_TOKEN).

Though I do not cover every token type – again, see yaml.h for more – the following example code should be illustrative:

#include <stdio.h>
#include <yaml.h>

int main(void)
{
  FILE *fh = fopen("config/public.yaml", "r");
  yaml_parser_t parser;
  yaml_token_t  token;   /* new variable */

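  /* Check the file first: a NULL FILE* must not reach libyaml or fclose() */
  if(fh == NULL) {
    fputs("Failed to open file!\n", stderr);
    return 1;
  }

  /* Initialize parser */
  if(!yaml_parser_initialize(&parser)) {
    fputs("Failed to initialize parser!\n", stderr);
    fclose(fh);
    return 1;
  }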

  /* Set input file */
  yaml_parser_set_input_file(&parser, fh);

  /* BEGIN new code */
  do {
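    /* On failure the token is not populated, so stop instead of reading it */
    if(!yaml_parser_scan(&parser, &token)) {
      fprintf(stderr, "Scanner error %d\n", parser.error);
      return 1;
    }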
    switch(token.type)
    {
    /* Stream start/end */
    case YAML_STREAM_START_TOKEN: puts("STREAM START"); break;
    case YAML_STREAM_END_TOKEN:   puts("STREAM END");   break;
    /* Token types (read before actual token) */
    case YAML_KEY_TOKEN:   printf("(Key token)   "); break;
    case YAML_VALUE_TOKEN: printf("(Value token) "); break;
    /* Block delimiters */
    case YAML_BLOCK_SEQUENCE_START_TOKEN: puts("Start Block (Sequence)"); break;
    case YAML_BLOCK_ENTRY_TOKEN:          puts("Start Block (Entry)");    break;
    case YAML_BLOCK_END_TOKEN:            puts("End block");              break;
    /* Data */
    case YAML_BLOCK_MAPPING_START_TOKEN:  puts("[Block mapping]");            break;
    case YAML_SCALAR_TOKEN:  printf("scalar %s \n", token.data.scalar.value); break;
    /* Others */
    default:
      printf("Got token of type %d\n", token.type);
    }
    if(token.type != YAML_STREAM_END_TOKEN)
      yaml_token_delete(&token);
  } while(token.type != YAML_STREAM_END_TOKEN);
  yaml_token_delete(&token);
  /* END new code */

  /* Cleanup */
  yaml_parser_delete(&parser);
  fclose(fh);
  return 0;
}

This simple loop reads every token from the document and prints it out. The output for public.yaml, indented to show block structure, is:

STREAM START  
  [Block mapping] 
  (Key token)   scalar title 
  (Value token) scalar Finex 2011 
  (Key token)   scalar img_url 
  (Value token) scalar /finex/html/img/ 
  (Key token)   scalar css_url 
  (Value token) scalar /finex/html/style/ 
  (Key token)   scalar js_url 
  (Value token) scalar /finex/html/js/ 
  (Key token)   scalar template_dir 
  (Value token) scalar html/templ/ 
  (Key token)   scalar pages 
  (Value token) Start Block (Sequence)
    Start Block (Entry)
      [Block mapping] 
      (Key token)   scalar act 
      (Value token) scalar idx 
      (Key token)   scalar title 
      (Value token) scalar Welcome 
      (Key token)   scalar html 
      (Value token) scalar public/welcome.phtml 
    End block
    Start Block (Entry)
      [Block mapping] 
      (Key token)   scalar act 
      (Value token) scalar reg 
      (Key token)   scalar title 
      (Value token) scalar Register 
      (Key token)   scalar html 
      (Value token) scalar public/register.phtml 
    End block
    Start Block (Entry)
      [Block mapping] 
      (Key token)   scalar act 
      (Value token) scalar log 
      (Key token)   scalar title 
      (Value token) scalar Log in 
      (Key token)   scalar html 
      (Value token) scalar public/login.phtml 
    End block
    Start Block (Entry)
      [Block mapping] 
      (Key token)   scalar act 
      (Value token) scalar out 
      (Key token)   scalar title 
      (Value token) scalar Log out 
      (Key token)   scalar html 
      (Value token) scalar public/logout.phtml 
    End block
  End block
End block
STREAM END  

It is clear that for simple documents, token-based parsing makes sense. However, a more natural paradigm is event-based parsing. This works by the similar functions

int yaml_parser_parse(yaml_parser_t *parser, yaml_event_t *event)
void yaml_event_delete(yaml_event_t *event)
The code using these functions, and its output, is as follows:
#include <stdio.h>
#include <stdlib.h>   /* exit, EXIT_FAILURE */
#include <yaml.h>

int main(void)
{
  FILE *fh = fopen("config/public.yaml", "r");
  yaml_parser_t parser;
  yaml_event_t  event;   /* New variable */

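  /* Check the file first: a NULL FILE* must not reach libyaml or fclose() */
  if(fh == NULL) {
    fputs("Failed to open file!\n", stderr);
    return 1;
  }

  /* Initialize parser */
  if(!yaml_parser_initialize(&parser)) {
    fputs("Failed to initialize parser!\n", stderr);
    fclose(fh);
    return 1;
  }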

  /* Set input file */
  yaml_parser_set_input_file(&parser, fh);

  /* START new code */
  do {
    if (!yaml_parser_parse(&parser, &event)) {
       printf("Parser error %d\n", parser.error);
       exit(EXIT_FAILURE);
    }

    switch(event.type)
    { 
    case YAML_NO_EVENT: puts("No event!"); break;
    /* Stream start/end */
    case YAML_STREAM_START_EVENT: puts("STREAM START"); break;
    case YAML_STREAM_END_EVENT:   puts("STREAM END");   break;
    /* Block delimiters */
    case YAML_DOCUMENT_START_EVENT: puts("Start Document"); break;
    case YAML_DOCUMENT_END_EVENT:   puts("End Document");   break;
    case YAML_SEQUENCE_START_EVENT: puts("Start Sequence"); break;
    case YAML_SEQUENCE_END_EVENT:   puts("End Sequence");   break;
    case YAML_MAPPING_START_EVENT:  puts("Start Mapping");  break;
    case YAML_MAPPING_END_EVENT:    puts("End Mapping");    break;
    /* Data */
    case YAML_ALIAS_EVENT:  printf("Got alias (anchor %s)\n", event.data.alias.anchor); break;
    case YAML_SCALAR_EVENT: printf("Got scalar (value %s)\n", event.data.scalar.value); break;
    }
    if(event.type != YAML_STREAM_END_EVENT)
      yaml_event_delete(&event);
  } while(event.type != YAML_STREAM_END_EVENT);
  yaml_event_delete(&event);
  /* END new code */

  /* Cleanup */
  yaml_parser_delete(&parser);
  fclose(fh);
  return 0;
}

The output, indented to show structure, is:

STREAM START
Start Document 
  Start Mapping 
    Got scalar (value title) 
    Got scalar (value Finex 2011) 
    Got scalar (value img_url) 
    Got scalar (value /finex/html/img/) 
    Got scalar (value css_url) 
    Got scalar (value /finex/html/style/) 
    Got scalar (value js_url) 
    Got scalar (value /finex/html/js/) 
    Got scalar (value template_dir) 
    Got scalar (value html/templ/) 
    Got scalar (value pages) 
    Start Sequence 
      Start Mapping 
        Got scalar (value act) 
        Got scalar (value idx) 
        Got scalar (value title) 
        Got scalar (value Welcome) 
        Got scalar (value html) 
        Got scalar (value public/welcome.phtml) 
      End Mapping 
      Start Mapping 
        Got scalar (value act) 
        Got scalar (value reg) 
        Got scalar (value title) 
        Got scalar (value Register) 
        Got scalar (value html) 
        Got scalar (value public/register.phtml) 
      End Mapping 
      Start Mapping 
        Got scalar (value act) 
        Got scalar (value log) 
        Got scalar (value title) 
        Got scalar (value Log in) 
        Got scalar (value html) 
        Got scalar (value public/login.phtml) 
      End Mapping 
      Start Mapping 
        Got scalar (value act) 
        Got scalar (value out) 
        Got scalar (value title) 
        Got scalar (value Log out) 
        Got scalar (value html) 
        Got scalar (value public/logout.phtml) 
      End Mapping 
    End Sequence 
  End Mapping 
End Document 
STREAM END 

There are two major things to notice about this. First, the output is better structured; the generic "End block" tokens are now specific events. Second, the code is simpler. Not only that, but my switch covers every event type. As you may recall from the token-based code, the switch statement there was incomplete, and should have contained many other token types, most of which I don't understand.

Also, an event-based approach is more amenable to object-oriented programming, and so is likely what you'll see when using libyaml with other languages.

Document-Based Parsing

I lied. There is actually a third way to parse YAML, based on the functions

int yaml_parser_load(yaml_parser_t *parser, yaml_document_t *document)
void yaml_document_delete(yaml_document_t *document)

These allow you to load individual documents into structures, and manipulate them using a variety of yaml_document_* functions. This is useful because YAML documents may be spread across multiple files, or individual files may contain many YAML documents. However, this is an advanced use case and I haven't looked into it.


Looking Forward

This concludes my tutorial. Looking forward, there are a few things to look out for. First, watch out for memory leaks. While libyaml does not allocate and return complete objects, any function that populates an existing object will allocate buffers. To avoid leaking, be sure to use the appropriate yaml_*_delete function before re-populating an object.

Also, watch out for versioning. The current YAML version is 1.2, but the libyaml version I'm using only supports up to 1.1. If this is a concern for you, you can use the functions yaml_get_version and yaml_get_version_string to check which release of the library you are linked against.

Watch out for encoding issues. This is the 21st century, after all. The STREAM_START event and token both have an .encoding field that you can check if you do not know your document encoding in advance.

Finally, you can build YAML trees in code. There are a variety of functions to create your own tokens, events and documents, and the yaml_emitter_t object (and associated functions) will allow you to output files. See yaml.h for details.


