Summarize metadata fields for reporting and quality control
Source: R/summarize_metadata.R
Create a publication-ready summary table for selected metadata fields. For each field, the summary reports the number of records, the number of non-missing values, the proportion missing, the number of unique (non-missing) values, and the top-N most frequent values with their counts.
Value
A data frame with one row per summarized field and the following columns:
- field
Field name.
- n
Total number of rows in data.
- n_non_missing
Number of non-missing values (!is.na).
- prop_missing
Proportion missing in [0, 1], rounded to 3 decimals.
- n_unique
Number of unique non-missing values.
- top_values
Top values formatted as "value (count); ...". NA if there are no non-missing values.
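To make the column definitions above concrete, here is a minimal base R sketch that computes the same statistics for a single vector. It is an illustration only, not the package's implementation: the helper name summarize_one and its top_n argument are hypothetical, and ties among equally frequent values are assumed to keep alphabetical order.

```r
# Hypothetical helper (not part of the package): summarize one vector
# following the column definitions listed under "Value".
summarize_one <- function(x, top_n = 3) {
  n <- length(x)
  non_missing <- x[!is.na(x)]
  n_non_missing <- length(non_missing)

  # Proportion missing in [0, 1], rounded to 3 decimals
  prop_missing <- if (n > 0) round(1 - n_non_missing / n, 3) else NA_real_
  n_unique <- length(unique(non_missing))

  if (n_non_missing == 0) {
    # No values to rank, so top_values is NA
    top_values <- NA_character_
  } else {
    # Count values, sort by frequency, keep the top_n most frequent
    counts <- sort(table(non_missing), decreasing = TRUE)
    counts <- counts[seq_len(min(top_n, length(counts)))]
    top_values <- paste0(names(counts), " (", as.integer(counts), ")",
                         collapse = "; ")
  }

  data.frame(
    n = n,
    n_non_missing = n_non_missing,
    prop_missing = prop_missing,
    n_unique = n_unique,
    top_values = top_values,
    stringsAsFactors = FALSE
  )
}

# Made-up language codes with two missing entries
lang <- c("fin", "fin", "eng", NA, "fin", "swe", "eng", NA)
res <- summarize_one(lang)
res
```

For the made-up vector, two of the eight entries are missing, so prop_missing is 0.25 and top_values ranks "fin" first.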
Details
This helper is intended for quick dataset characterization (e.g., Methods, appendices, QC notes) after retrieving records with the package.
Examples
record <- search_finna("sibelius")
#> Warning: Default limit of 100 records is being used. Specify 'limit' argument for more records.
#> Total results found: 44905
#> Data fetching completed in 0.31 seconds.
overview <- summarize_metadata(
  record,
  fields = c("id", "Title", "Author", "Year", "Language", "Formats",
             "Subjects", "Library", "Series", "last_indexed")
)
overview
#>           field   n n_non_missing prop_missing n_unique
#> 1            id 100           100         0.00      100
#> 2         Title 100           100         0.00        4
#> 3        Author 100           100         0.00       49
#> 4          Year 100            98         0.02       40
#> 5      Language 100            89         0.11        7
#> 6       Formats 100           100         0.00       11
#> 7      Subjects 100            72         0.28       63
#> 8       Library 100           100         0.00       62
#> 9        Series 100            23         0.77       14
#> 10 last_indexed 100             0         1.00        0
#> top_values
#> 1 abo.kortkat_044103 (1); abo.kortkat_051809 (1); abo.kortkat_099113 (1)
#> 2 Sibelius (90); SIBELIUS (5); Sibelius. (4)
#> 3 Sibelius, Jean (17); Häyrynen, Antti (11); Lampila, Hannu-Ilari (7)
#> 4 1948 (8); 1997 (8); 1945 (6)
#> 5 fin (42); zxx (19); eng (12)
#> 6 Kirja, Kirja (52); Äänite, CD (17); Lehti/Artikkeli, Artikkeli (14)
#> 7 Sibelius, Jean. (6); Sibelius, Jean (2); Sibelius, Jean,, 1865-1957. (2)
#> 8 Kansalliskirjasto, Arto (20); Anders-kirjastot, Kokkola (5); Åbo Akademin kirjasto, Digitoitu korttikatalogi (–1981) (5)
#> 9 CD-sivut (4); Plus CD-lehti (4); Plus CD-sivut (3)
#> 10 <NA>