The cognitive hierarchy is divided, from the bottom up, into four levels: data, information, knowledge, and wisdom.
The DIKW hierarchy:
Data: words, symbols, or images that have not been organized or processed, drawn directly from facts. A piece of data represents only itself. Data is isolated, so to speak: the value 165 stands for 165 and nothing else.
Information: what we get when we organize or process data in some way, analyzing it so that the pieces of data relate to one another. Information can answer simple questions such as who? and where? It is data arranged in order. For example, 165 followed by cm becomes a height, although what exactly it is the height of still depends on context.
Knowledge: the process of judging whether information is useful. This process combines context, interpretation, and reflection. The more information is combined, the easier it is to judge correctly. For example, "a woman's height is 165 cm" is a conclusion reached by interpreting information, and building up such conclusions is a cumulative process.
Wisdom: the ability to make correct judgments and to apply knowledge correctly. Wisdom can answer the question why, and it judges right from wrong and good from bad. It looks toward the future, tries to understand what was not understood in the past, and is unique to humans. Knowing that a woman is 165 cm tall, when we want to buy a dress for a girlfriend we choose the right size instead of a children's dress.
As another example: a durian is data; whether it tastes good is information; the experience gained by actually tasting it is how knowledge is produced; and the judgment formed after eating it is wisdom.
These levels describe the process of human cognitive development.
The relationship between information and data:
Data is the carrier of information and the form in which information is expressed. Once a piece of data is given meaning, it can be called information. We can therefore say that there is no information without data: data is the source of information.[1]
Data is the record, the carrier, and the form of presentation, and it is not limited to electronic form. Information is the content, the meaning behind the data: it is carried by data and gives the data a meaningful interpretation.
Information has meaning; data by itself does not. We call the result of processing data information, which is why information is targeted and time-sensitive. Put another way, data is more concrete and information is more abstract.
Data are raw facts, and information is the result of processing data.
People with different knowledge and experience can draw different information from the same data.
As the inventors and users of writing, people find it natural to express and convey information through words. A computer, however, cannot represent information as words directly, because the machine does not understand words.
A computer is built from logic circuits, which usually have only two physical states: "on" and "off". These two states can be represented by the digits 1 and 0, with 1 for the on state and 0 for the off state.[2]
Representing information with the digits 1 and 0 is called binary notation. So how can a computer that understands only 0 and 1 read human text? We have to make a dictionary for the computer that translates text into its language, that is, a mapping between different combinations of 0s and 1s and characters. This is character encoding.
For example, if you type "I like you" into a computer, the computer cannot understand it, because it can only read the digits 0 and 1. So we write a dictionary into the computer that tells it which combination of bits stands for each character.
With this dictionary the computer can look up your input and understand it: what you typed is 000 001 010 011.
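The dictionary lookup described above can be sketched in a few lines of Python. The actual table is not reproduced in the text, so this is only an illustration under assumptions: it guesses that the phrase was the four-character Chinese sentence 我喜欢你 ("I like you"), and the names CODEBOOK, encode, and decode are made up for this example.

```python
# Hypothetical 3-bit codebook. The characters and their codes are assumptions
# chosen to match the output "000 001 010 011" quoted in the text.
CODEBOOK = {"我": "000", "喜": "001", "欢": "010", "你": "011"}

# The reverse dictionary lets the computer translate bits back into characters.
DECODEBOOK = {code: char for char, code in CODEBOOK.items()}


def encode(text: str) -> str:
    """Replace each character with its bit pattern, separated by spaces."""
    return " ".join(CODEBOOK[char] for char in text)


def decode(bits: str) -> str:
    """Look up each bit pattern to recover the original characters."""
    return "".join(DECODEBOOK[code] for code in bits.split())


encoded = encode("我喜欢你")
print(encoded)          # 000 001 010 011
print(decode(encoded))  # 我喜欢你
```

Real encodings work the same way in principle, only with a much larger table and longer bit patterns.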
In practice, this work has already been done, and the resulting dictionaries are shared all over the world; otherwise a computer in one place could not read data produced by a computer in another.
To a computer, a combination of 0s and 1s represents a character. One byte can represent 2^8 = 256 different combinations.
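As a quick check of the arithmetic: a byte has 8 bits, each bit has 2 states, so there are 2^8 = 256 patterns, covering the values 0 through 255. The variable names below are only illustrative.

```python
BITS_PER_BYTE = 8

# Each bit doubles the number of possible patterns.
patterns = 2 ** BITS_PER_BYTE
print(patterns)  # 256

# The smallest and largest patterns, written as 8 binary digits.
smallest = format(0, "08b")
largest = format(patterns - 1, "08b")
print(smallest, largest)  # 00000000 11111111
```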
Through this encoding, human text is translated into binary data that the computer can process.
To ensure that information is exchanged correctly between people and devices, and between devices and computers, a unified code for information exchange was established. This is the ASCII code, whose full name is "American Standard Code for Information Interchange".
For example, looking up the letter A in the ASCII table gives the code value (01000001)2 = (41)16 = 65, that is, 01000001 in binary, 41 in hexadecimal, and 65 in decimal.
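The same lookup can be reproduced with Python's built-in ord, chr, and format functions. This is a small illustration, not part of the ASCII standard itself.

```python
# ord() returns the code value of a character; for "A" it matches the ASCII table.
code = ord("A")
print(code)  # 65

# The same value written in binary (8 digits) and in hexadecimal.
binary = format(code, "08b")
hexadecimal = format(code, "X")
print(binary)       # 01000001
print(hexadecimal)  # 41

# chr() goes the other way: from a code value back to the character.
print(chr(0b01000001))  # A
```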
Computers were invented in English-speaking countries, so the encoding of Chinese characters was not considered at first. To represent Chinese characters with strings of 0s and 1s, China formulated the Chinese character information exchange code GB2312-80, known as the national standard code (GB code) for short.
Because there are so many Chinese characters, a single byte cannot cover them all, so every symbol in the GB code is represented by two bytes (16 binary bits).
The national standard code contains 7445 characters. Of these, 3755 are level-1 Chinese characters, ordered by Hanyu Pinyin, and 3008 are level-2 Chinese characters, ordered by radical and stroke count. The remaining 682 are non-Chinese symbols such as letters and punctuation.
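The two-byte representation can be observed with Python's built-in gb2312 codec. The character 中 is just an example chosen for this sketch.

```python
char = "中"

# Encode the character with the GB2312 codec that ships with Python.
encoded = char.encode("gb2312")
print(len(encoded))   # 2
print(encoded.hex())  # d6d0

# Decoding the two bytes gives the character back.
decoded = encoded.decode("gb2312")
print(decoded)  # 中
```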
As computers spread, each non-English-speaking country created its own encoding, and these encodings were incompatible with one another. Unicode emerged to solve this: it unifies the characters of all languages into a single code, so that mixing languages no longer produces garbled text.
Unicode characters are usually stored as two bytes each (as in UTF-16); rarer characters take four bytes.
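The two-byte and four-byte cases can be seen by encoding a few characters as UTF-16, one common way of storing Unicode text. The helper name utf16_size and the sample characters are assumptions made for this sketch.

```python
def utf16_size(char: str) -> int:
    """Number of bytes the character takes in UTF-16 (big-endian, no BOM)."""
    return len(char.encode("utf-16-be"))


# A Latin letter, a common Chinese character, and an emoji outside the
# range that fits in two bytes.
for char in ["A", "中", "😀"]:
    print(f"U+{ord(char):04X}", utf16_size(char), "bytes")
# U+0041 2 bytes
# U+4E2D 2 bytes
# U+1F600 4 bytes
```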
Encoding for storage:
If the text is entirely English, spending two bytes per character is wasteful: one byte would be enough, yet two are used.
To save space there is the UTF-8 encoding. UTF-8 encodes each Unicode character into 1 to 4 bytes depending on the size of its code point (the original design allowed up to 6 bytes, but the current standard limits it to 4). Common English letters are encoded as 1 byte, Chinese characters usually as 3 bytes, and only rare characters as 4 bytes. If the text you are transferring contains a large number of English characters, UTF-8 saves space.
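The variable length of UTF-8 is easy to check in Python. The helper name utf8_size and the sample strings are assumptions made for this sketch.

```python
def utf8_size(text: str) -> int:
    """Number of bytes the text takes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


print(utf8_size("A"))   # 1 (English letter)
print(utf8_size("中"))  # 3 (common Chinese character)
print(utf8_size("😀"))  # 4 (character outside the two-byte range)

# For mostly English text, UTF-8 needs about half the space of a
# fixed two-bytes-per-character layout such as UTF-16.
english = "hello"
print(utf8_size(english))                # 5
print(len(english.encode("utf-16-be")))  # 10
```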
The relationship between Unicode and UTF-8:
In memory, text is handled uniformly as Unicode; when it is saved to the hard disk, for example after editing a text file, it is converted to UTF-8.
Unicode is a unified character encoding: it assigns every character a unique code, guaranteeing no duplicates.
UTF-8 is a compact encoding scheme for saving Unicode-encoded strings to the hard disk.
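Python 3 follows the same split, which makes it a convenient illustration: a str in memory is a sequence of Unicode code points, and encode produces the UTF-8 bytes that would be written to disk. The sample text is an assumption made for this sketch.

```python
text = "A中"

# In memory: each character is identified by its unique Unicode code point.
code_points = [ord(char) for char in text]
print(code_points)  # [65, 20013]

# For storage: the same text converted to UTF-8 bytes.
stored = text.encode("utf-8")
print(stored)       # b'A\xe4\xb8\xad'
print(len(stored))  # 4

# Reading the bytes back and decoding them restores the original text.
restored = stored.decode("utf-8")
print(restored == text)  # True
```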
Bit:
A bit is a single binary digit. Terms such as 32bit and 64bit (also written 32-bit and 64-bit) count bits.
1 byte (Byte) = 8 bits
1 bit = 1 binary digit
Byte:
(In programming, the byte is the unit most often used when analyzing how variables are laid out in memory, so even KB is rarely needed there. The larger units are megabytes (MB), GB, and TB.)
Therefore 1 byte = 8 binary digits.
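The conversion between bits and bytes can be sketched as follows. The function name bits_to_bytes is made up for this example.

```python
BITS_PER_BYTE = 8


def bits_to_bytes(bits: int) -> int:
    """Convert a bit count to bytes; the count must be a multiple of 8."""
    if bits % BITS_PER_BYTE != 0:
        raise ValueError("bit count is not a whole number of bytes")
    return bits // BITS_PER_BYTE


print(bits_to_bytes(32))  # 4
print(bits_to_bytes(64))  # 8

# The largest value that fits in one byte needs exactly 8 binary digits.
print((255).bit_length())  # 8
```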
The basic unit of computer storage capacity:
The basic unit of computer storage capacity is the byte (Byte), written with a capital B. Both bits and bytes are units of storage capacity, but they are different: the bit is the smallest unit of storage capacity, while the byte is the basic unit.
What is a byte?
Byte is a common unit of storage capacity in a computer. In practical terms it means a group of 8 binary bits. For instance, an English letter is usually encoded in 8 bits, so we say the letter occupies one byte of storage. A Chinese character usually occupies two bytes (in the GB code; three in UTF-8).
Why is the byte the basic unit of storage capacity?
As the definition above shows, English letters, Chinese characters, and other computer symbols are all stored in binary encoding. Expressing storage capacity as a count of binary bits would be inconvenient and hard to remember. Grouping the bits in eights and calling each group a byte is much more convenient and easier to remember, and it also lines up with the ASCII code. Saying that the letter a occupies one byte of storage is easier to remember than saying it occupies 8 binary bits. This is why the byte is the basic unit of storage capacity in a computer.
So although both bytes and bits are units of storage capacity, the basic unit is the byte, not the bit; the bit is only the smallest unit of storage capacity.
Binary and its operations:
The decimal system we normally use carries 1 for every 10; likewise, binary carries 1 for every 2.
Binary operations:
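The carry rule for binary (carry 1 for every 2) can be sketched as a digit-by-digit addition. This is only one possible illustration; the function name add_binary is an assumption, and Python's built-in int(x, 2) and bin() would do the same job.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings from right to left, carrying 1 whenever a column reaches 2."""
    digits = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        total = carry
        if i >= 0:
            total += int(a[i])
            i -= 1
        if j >= 0:
            total += int(b[j])
            j -= 1
        digits.append(str(total % 2))  # digit that stays in this column
        carry = total // 2             # 1 if the column reached 2 or 3
    return "".join(reversed(digits))


print(add_binary("1", "1"))       # 10
print(add_binary("1011", "110"))  # 10001
```

In decimal the second sum reads 11 + 6 = 17, and 10001 in binary is indeed 17.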
[1]: What is the difference between information and data? - Zhihu
[2]: Python from Zero Basics to Mastery, Section 3.1: Master the information representation of the computer - Zhihu