Chapter 5 - Names, notes, and labels
Common pitfalls:
If a condition has the form "varname > some value", add a nonmissing condition: & !missing(varname)
Keep variable names to at most 12 characters
After renaming a variable, record the original name at the end of its label, in parentheses: (original variable name)
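The first pitfall comes from how Stata orders values: numeric missing values are treated as larger than any real number, so a bare "greater than" condition also selects observations where the variable is missing. A minimal sketch, assuming the auto dataset shipped with Stata (the variable names high_bad, high_good, and high_flag are made up for illustration):

```stata
* rep78 has some missing values in the auto data
sysuse auto, clear

* Without the guard, observations with missing rep78 are coded 1
generate byte high_bad = rep78 > 3

* Guard in the if qualifier: missing rep78 stays missing
generate byte high_good = rep78 > 3 if !missing(rep78)

* Guard inside the condition: missing rep78 is coded 0
generate byte high_flag = rep78 > 3 & !missing(rep78)

tabulate high_bad high_good, missing
```

Which of the two guarded versions is appropriate depends on whether missing cases should stay missing or count as 0 in the new variable.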
Variable name
When planning names, anticipate new variables that could be added later.
Variable names also need careful design.
1.1 The fundamental principle for creating and naming variables (Section 5.6.1)
Never change a variable unless you give it a new name.

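(1) gen and clonevar

Differences between gen and clonevar:
A new variable created with gen has the same descriptive statistics as the original variable but no value labels or variable label.
A new variable created with clonevar is identical to the original variable.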
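The difference is easy to check on a small example. A minimal sketch, assuming the auto dataset shipped with Stata (foreign_gen and foreign_cln are illustrative names):

```stata
sysuse auto, clear

* generate copies the values only
generate foreign_gen = foreign

* clonevar also copies the storage type, format, variable label, and value label
clonevar foreign_cln = foreign

* Same statistics for all three, but only the clone keeps the labels
summarize foreign foreign_gen foreign_cln
describe foreign foreign_gen foreign_cln
```

Both commands create the copy under a new name, which keeps the original variable untouched, in line with the principle in 1.1.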

1.2 Planning names
If you are collecting your own data, you should plan names before the dataset is created.
1.3 Principles for selecting names
(1)Anticipate looking for variables

(2)Use simple, unambiguous names
Use names that are at most 12 characters long. P149
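Names longer than 12 characters are truncated in some output.
After renaming a variable, record the original name in its label. P149
Naming binary variables: name the variable after the category coded 1. P150
Naming ordered multi-category variables: name them by the direction of the scale, P for positive and N for negative. P150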
(3)Recommendations for capital letters P150
(4)Try names before you decide
Selecting effective names and labels is an iterative process.
Labeling variables
Every variable should have a variable label.

The label format used for dummy variables is especially worth copying:
a question in subject-verb-object form + "?" + "1=yes 0=no"
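As an illustration of this format combined with the naming rule for binary variables, here is a minimal sketch. The variable female and its label text are hypothetical, not taken from the book's data:

```stata
clear
set obs 4

* Hypothetical 0/1 indicator, named after the category coded 1
generate byte female = mod(_n, 2)

* Question + "?" + coding, short enough to fit in the first 30 columns
label variable female "Is R female? 1=yes 0=no"

describe female
tabulate female
```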
2.1 Listing variable labels and other information
There are many reasons why you might want a list of variables with their labels: to construct tables of descriptive statistics in a paper, to remind you of the names of variables as you plan your analyses, or to help you clean your data (file: wf5-varlabels.do).
(1)codebook, compact P184

(2)describe

The *'s indicate that there is a note associated with that variable.
(3)nmlab: lists only variable names and labels

(4)tabulate: shows the variable label and the value labels

Although tabulate does not truncate long labels, longer labels are often more difficult to understand than shorter ones:

(5)Test labels before you post the file: codebook, compact and tabulate
If you do not like how the labels appear in the output, try different labels. Rerun the test commands and repeat the cycle until you are satisfied.
2.2 Changing the order of variables in your dataset: order, aorder, and move
Changing the order lets you put frequently used variables first to make them easier to click on in the Variables window.
The aorder command arranges the variables in varlist alphabetically. The syntax is
aorder [varlist]
If no varlist is given, all variables are alphabetized.
The order command allows you to move a group of variables to the front of the dataset:
order varlist
To move one variable, use the command
move variable-to-move target-variable
where variable-to-move is placed in front of the target-variable.
For many datasets, I run this pair of commands:
aorder
order id
where id is the name of the variable with the ID number. This arranges variables alphabetically, except that the ID variable appears first. The best way to learn how these commands work is to open a dataset, try the commands, and watch how the list of variables in the Variables window changes.
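In Stata 11 and later, as far as I know, aorder and move still run but are documented as superseded by options of order. A minimal sketch of the equivalent calls, assuming the auto dataset shipped with Stata, with make standing in for the ID variable:

```stata
sysuse auto, clear

* Same effect as: aorder
order _all, alphabetic

* Same effect as: order id (placing first is the default behaviour)
order make, first

* Same effect as: move price mpg (price is placed in front of mpg)
order price, before(mpg)

describe, simple
```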
2.3 Syntax for label variable
(1)Create labels
label variable assigns a text label of up to 80 characters to a variable. Although up to 80 characters are allowed, labels longer than 30 characters are often truncated in output.
Put the most important information in the first 30 columns of a variable label.
The syntax is
label variable varname "label"
*Can be abbreviated as:
label var varname "xxx"
e.g.,
label var artsqrt "Square root of # of articles"
(2)Remove labels
label variable varname
*With nothing after varname, the label is removed.
e.g.,
label var artsqrt
(3)Temporarily changing variable labels P190
- When I do not want tabulate output to show the labels and want the variable names shown instead.
- When I want to change the labels in graphs, because by default the variable label is used to label the axes.
label variable varname
As a loop:
foreach varname in pub1 pub3 pub6 pub9 {
    label var `varname' ""
    tabulate `varname', missing
}
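To make the change truly temporary, one option is to store the label in a local macro and put it back afterwards. This is a sketch of that idea under my own assumptions (the auto dataset and the macro name saved), not the book's code:

```stata
sysuse auto, clear

* Remember the current label
local saved : variable label foreign

* Blank the label so that output shows the variable name
label variable foreign ""
tabulate foreign, missing

* Put the original label back
label variable foreign "`saved'"
describe foreign
```

Wrapping the block in preserve and restore would achieve the same thing for several variables at once, at the cost of copying the data.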
(4)★ Creating variable labels that include the variable name

unab varlist : _all
display "varlist is: `varlist'"
foreach varname in `varlist' {
    local varlabel : variable label `varname'
    label var `varname' "`varname': `varlabel'"
}
*P191(159)
local varlabel : variable label lfp
*assigns the variable label of lfp to the local macro varlabel
If I wanted to keep the new labels, I would save these in a new dataset.
In other words, once every variable label has been rewritten in this format, the data should be saved to a separate .dta file to keep it distinct from the original data.
Adding notes to variables P192(160)

Notes can be very long.

3-1 notes varname: notes accumulate, a new note does not overwrite earlier ones

3-2 Adding a time stamp to notes: TS
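Both points can be seen in a short example. A minimal sketch, assuming the auto dataset shipped with Stata. The note texts are hypothetical, and TS is written with blanks around it so that Stata replaces it with the current date and time:

```stata
sysuse auto, clear

* First note: TS is replaced by a time stamp when the note is stored
notes price: TS first note (hypothetical text)

* Second note: added as note 2, note 1 is left untouched
notes price: second note (hypothetical text)

notes list price
```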

3-3 Commands for working with notes
(1)Listing notes
① To list all notes in a dataset
notes
② To list the notes for selected variables
notes list varlist
③ To list notes from start# to end#:
If you have multiple notes for a variable,they are numbered.
notes list variable-list in start-#/end-#
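This command looks complicated and somewhat advanced, but in practice it is of little use: if I need to look up the notes for a variable, it is because I do not remember them, so I will not remember a note's number either. It is simpler to list everything with notes or notes list varname.
④ List notes with codebook using the notes option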
codebook varlist, notes
(2)Remove notes
notes drop evarlist [in #[/#]]
(3)Search notes
notes search [sensitive] text
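The listing, searching, and dropping commands can be tried together. A minimal sketch, assuming the auto dataset shipped with Stata and two hypothetical notes:

```stata
sysuse auto, clear
notes price: first hypothetical note about outliers
notes price: second hypothetical note about units

* Show only note 2 for price
notes list price in 2/2

* List every note that contains the word
notes search outliers

* Remove note 1, then check what is left
notes drop price in 1
notes list price
```

After the drop, the remaining note keeps its number 2. As far as I know, notes renumber can be used to close such gaps.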
Value labels [notes cover the book up to PDF p. 200]
Categorical variables should have value labels unless the variable has an inherent metric.
Main purpose: to tell which number stands for which category when running tab.
4.1 Creating value labels: a two-step process
Step 1: Defining labels
Define a set of labels tied to values, without yet saying which variables use them.
//For yes/no questions with yes coded as 1 and no coded as 0, I could define the label as
label define yesno 1 yes 0 no
Other examples: [figure not preserved]

Step 2: Assigning labels
After labels are defined, label values assigns the defined labels to one or more variables.
//For example, because wc and hc are yes/no questions, I can use the label definition yesno for both variables:
label values wc yesno
label values hc yesno
//Or, I can assign labels to both variables in one command:
label values wc hc yesno
Why a two-step system?
facilitates having consistent labels across variables and simplifies making changes to labels used by multiple variables.
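Because a value label created with one label define can be shared by several variables, editing the definition from step 1 changes the labels of all of those variables at once.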
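A minimal sketch of such a batch change, using made-up data for wc and hc and the modify option of label define:

```stata
clear
set obs 4

* Hypothetical 0/1 data for the two variables
generate byte wc = mod(_n, 2)
generate byte hc = _n > 2

* Step 1 and step 2 of the two-step process
label define yesno 1 "yes" 0 "no"
label values wc hc yesno

* One change to the definition updates both variables
label define yesno 1 "1_yes" 0 "0_no", modify

tabulate wc hc
```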
4.2 Remove labels
label values varlist [.]
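//that is:
label values varlist //use label values without specifying the label
//or
label values varlist .
Note: although the yesno label has been removed from wc in the example above, the label definition itself is not deleted and can still be used by other variables.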
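The note can be verified directly. A minimal sketch with made-up data for wc and hc:

```stata
clear
set obs 2
generate byte wc = _n - 1
generate byte hc = _n - 1

label define yesno 1 "yes" 0 "no"
label values wc hc yesno

* Detach the label from wc only
label values wc .

* The definition still exists and hc still uses it
label list yesno
describe wc hc
```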
4.3 Principles for constructing value labels
Plan value labels before creating them. The plan should settle which variables can share a label, how labels for missing values are written, and what the labels contain.
(1)Keep labels short
Value labels should be eight or fewer characters in length
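Incorrect example: [figure not preserved]
Correct example: [figure not preserved]
Advantage 1: capping value labels at 9 characters ensures they are displayed in full.
Advantage 2: putting the number at the start of the value label makes it easy to see which numeric value the label stands for. [This is the next principle.]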
(2)Include the category number
Use value labels that include both a label and the value for each category, as illustrated with the label sd_v2. [See the correct example in the figure above: the number comes first in the label.]
Adding values to value labels P199(167)
If I forget to include the values when I first define the value labels, I can add them with the numlabel command.
For example, suppose I started with this definition:
label define defnot 1 Definite 2 Probably 3 ProbNot 4 DefNot
To add values to the front of the label, I use the command:
numlabel defnot, mask(#) add
//Literal meaning of the command: put the "mask" # on the labels defined in defnot
//syntax:
numlabel valuelabelname, add mask([#])
Extended uses of mask():
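One such use is adding a separator after the number. In the mask, # stands for the value and any other characters are copied into the label as written. A minimal sketch, assuming a mask of "#_" and a made-up variable v that carries the label:

```stata
clear
set obs 4

* Hypothetical variable holding the four categories
generate byte v = _n

label define defnot 1 Definite 2 Probably 3 ProbNot 4 DefNot
label values v defnot

* Prefix each label with its value and an underscore
numlabel defnot, add mask("#_")

label list defnot
tabulate v
```

To undo the change, numlabel's remove option takes the same mask that was used when adding.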
Summary
Number of characters
Variable names are at most 12 characters long.
Variable labels are often truncated after 30 characters, although label variable accepts a text label of up to 80 characters. Put the most important information in the first 30 columns.
Value labels should be 8 or fewer characters long (9 at the very most).
















