A Comprehensive Analysis of the HashMap Class in Java

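HashMap and HashSet are two important members of the Java Collections Framework: HashMap is a commonly used implementation of the Map interface, and HashSet is a commonly used implementation of the Set interface. Although they implement different interfaces, their underlying hash storage mechanism is identical; in fact, HashSet is itself implemented on top of HashMap.
In practice HashSet and HashMap have a lot in common. A HashSet uses a hash algorithm to decide where each element is stored, which makes storing and retrieving elements fast. A HashMap treats each key-value pair as a single unit and uses a hash of the key to decide where the pair is stored, which makes storing and retrieving key-value pairs fast.
Before discussing collection storage, one point needs to be made: although collections are said to store Java objects, the objects themselves are never placed inside a Set; the Set only keeps references to them. In other words, a Java collection is really a collection of reference variables, each pointing to an actual Java object.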
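The point about references can be checked with a small sketch. The class name `ReferenceDemo` and its method are hypothetical names chosen for illustration; `StringBuilder` is used only because it is mutable and keeps the identity-based `hashCode` inherited from `Object`.

```java
import java.util.HashSet;
import java.util.Set;

class ReferenceDemo {
    // Returns true if the set holds the very same object that was added (no copy is made).
    static boolean holdsSameObject() {
        StringBuilder sb = new StringBuilder("a");
        Set<StringBuilder> set = new HashSet<>();
        set.add(sb);
        sb.append("b"); // mutate the object through the original variable
        StringBuilder stored = set.iterator().next();
        // Same reference, and the mutation is visible through the set
        return stored == sb && stored.toString().equals("ab");
    }

    public static void main(String[] args) {
        System.out.println(holdsSameObject());
    }
}
```

Because only the reference is stored, a change made through the original variable is visible when the element is read back from the set.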

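1. Basic characteristics of HashMap

The class comments in the JDK source of HashMap already summarize many of its characteristics.

HashMap permits null for both keys and values, whereas Hashtable does not.

HashMap is not thread-safe, whereas Hashtable is.

The order of the elements in a HashMap is not guaranteed to stay constant; over time the position of the same element may change (for example after a resize).

The time needed to iterate over a HashMap is proportional to its capacity plus the number of elements it currently holds (size). If iteration performance matters, do not set the initial capacity too high or the load factor too low.

As with the List classes discussed previously, HashMap's iterators are fail-fast: if the map is structurally modified during iteration by anything other than the iterator's own remove method, the iterator throws ConcurrentModificationException. Because HashMap is not thread-safe, a synchronized wrapper can be obtained with Collections.synchronizedMap(hashMap).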
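Two of the characteristics above, null handling and fail-fast iteration, can be observed directly. The following is a minimal sketch; the class and method names are hypothetical, and only standard `java.util` calls are used.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class HashMapTraitsDemo {
    // HashMap accepts a null key and a null value.
    static boolean hashMapAcceptsNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, null);
        return map.containsKey(null) && map.get(null) == null;
    }

    // Hashtable rejects a null key with a NullPointerException.
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "v");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Removing through the map (not the iterator) while iterating trips the fail-fast check.
    static boolean structuralChangeDuringIterationFails() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change not made through the iterator
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(hashMapAcceptsNulls());
        System.out.println(hashtableRejectsNullKey());
        System.out.println(structuralChangeDuringIterationFails());
    }
}
```

To remove entries safely while iterating, call `remove()` on the iterator itself instead of on the map.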

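2. Hash table data structure analysis

A hash table is a data structure that accesses a storage location directly from a key. In other words, a hash table establishes a direct mapping between keys and storage addresses.

A key is passed through the hash function to obtain an index into the buckets.

The indexes produced by a hash function will inevitably coincide for some keys; this is called a collision. A few ways of resolving collisions are briefly introduced below:

Open addressing: on a collision, scan the following positions in the table in sequence and place the entry in the first free one. The detailed algorithm is not covered here.

Separate chaining: on a collision, link the entries that share the same index into a linked list. The detailed algorithm is not covered here.

The HashMap in the JDK resolves collisions with separate chaining.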
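Separate chaining can be sketched in a few lines. This is a simplified illustration, not the JDK implementation: `ChainedTable`, its `Node` class and the fixed bucket count are all assumptions made for the example, and there is no resizing.

```java
import java.util.Objects;

class ChainedTable<K, V> {
    // One node of a bucket's singly linked collision chain.
    private static final class Node<K, V> {
        final K key;
        V value;
        final Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int bucketCount) {
        buckets = (Node<K, V>[]) new Node[bucketCount];
    }

    // Map a key to a bucket index; null keys go to bucket 0.
    private int indexOf(Object key) {
        return (key == null) ? 0 : Math.floorMod(key.hashCode(), buckets.length);
    }

    // Replace the value if the key is already in the chain, otherwise insert at the head.
    V put(K key, V value) {
        int i = indexOf(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]);
        return null;
    }

    // Walk the chain of the key's bucket until an equal key is found.
    V get(Object key) {
        for (Node<K, V> n = buckets[indexOf(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

With four buckets, the Integer keys 1 and 5 land in the same bucket, yet both remain retrievable because the chain is searched with `equals`.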

3. HashMap source code analysis (JDK 1.7)

1. Reading and writing elements

Entry
The elements stored in a HashMap are of type Entry. The source of Entry is:

static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    int hash;
    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }
    // Getters and setters for key and value are omitted; they are used later by the iterators
    ...
    public final boolean equals(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry e = (Map.Entry)o;
        Object k1 = getKey();
        Object k2 = e.getKey();
        if (k1 == k2 || (k1 != null && k1.equals(k2))) {
            Object v1 = getValue();
            Object v2 = e.getValue();
            if (v1 == v2 || (v1 != null && v1.equals(v2)))
                return true;
        }
        return false;
    }
    // The entry's hashCode is the XOR of the key's hashCode and the value's hashCode
    public final int hashCode() {
        return Objects.hashCode(getKey()) ^ Objects.hashCode(getValue());
    }
    public final String toString() {
        return getKey() + "=" + getValue();
    }
    /**
     * This method is invoked whenever the value in an entry is
     * overwritten by an invocation of put(k,v) for a key k that's already
     * in the HashMap.
     */
    void recordAccess(HashMap<K,V> m) {
    }
    /**
     * This method is invoked whenever the entry is
     * removed from the table.
     */
    void recordRemoval(HashMap<K,V> m) {
    }
}

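An Entry holds a key, a value, a hash, and a reference to the next Entry, so it is clearly a node of a singly linked list. It implements the Map.Entry interface.

recordAccess(HashMap<K,V>) and recordRemoval(HashMap<K,V>) have empty bodies in HashMap. LinkedHashMap overrides them to maintain its linked list of entries, which is what makes access-ordered (LRU-style) behavior possible.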
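The hooks above are JDK-internal, but the LRU behavior they enable is reachable through LinkedHashMap's public API: the three-argument constructor with `accessOrder = true` and the protected `removeEldestEntry` method. The sketch below assumes a hypothetical class `LruCache` with a fixed `maxEntries` limit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // hypothetical capacity limit for this sketch

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds the limit, so the least recently used entry is evicted
        System.out.println(cache.keySet());
    }
}
```

In this usage the entry for "b" is the one evicted, because "a" was touched by `get` after "b" was inserted.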

get: reading an element
get looks up the matching Entry in the HashMap. The relevant source is:

public V get(Object key) {
    // The key is null
    if (key == null)
        return getForNullKey();
    // Look up the Entry by key
    Entry<K,V> entry = getEntry(key);
    return null == entry ? null : entry.getValue();
}

Source of getForNullKey:

private V getForNullKey() {
    if (size == 0) {
        return null;
    }
    // Walk the collision chain
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null)
            return e.value;
    }
    return null;
}

The Entry whose key is null is stored in table[0]. The collision chain at table[0] may hold other entries and does not necessarily contain a null key, so the chain has to be traversed.

Getting the Entry for a key:

final Entry<K,V> getEntry(Object key) {
    if (size == 0) {
        return null;
    }
    int hash = (key == null) ? 0 : hash(key);
    // Use the hash to find the index in table, then walk the collision chain to find the key
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

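That is how HashMap reads an Entry. The time complexity is O(1) on average; in the worst case, when many keys fall into the same bucket, the lookup has to walk the whole collision chain.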
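The role of the chain walk in getEntry can be made visible with keys that all share one hash code. The sketch below uses a hypothetical key type `BadKey` whose `hashCode` is deliberately constant, so every entry lands in the same bucket and lookups succeed only because `equals` is checked along the chain.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeyDemo {
    // Hypothetical key type: every instance reports the same hash code.
    static final class BadKey {
        final String name;

        BadKey(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return 42; // forces all keys into one bucket
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    // Looks up a key by name in a map whose two keys collide.
    static String lookup(String name) {
        Map<BadKey, String> map = new HashMap<>();
        map.put(new BadKey("x"), "first");
        map.put(new BadKey("y"), "second");
        return map.get(new BadKey(name));
    }

    public static void main(String[] args) {
        System.out.println(lookup("x"));
        System.out.println(lookup("y"));
        System.out.println(lookup("z"));
    }
}
```

The results stay correct, but each lookup degrades to a linear search of the bucket, which is why a well-distributed `hashCode` matters for the average O(1) cost.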

put: writing an element
put is comparatively complex, because it may trigger a resize of the HashMap.
When an element is written and the HashMap already contains its key, put replaces the value, which is equivalent to an update. The source of put is:

public V put(K key, V value) {
    // If the table is still empty, allocate it based on the threshold
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    // Store the Entry whose key is null
    if (key == null)
        return putForNullKey(value);
    // Compute the hash and map it to a bucket index
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    // Walk the collision chain at this index to see whether the key already exists
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        // If the key exists, replace the old value and return it
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    // The key is not present in the collision chain
    modCount++;
    // Insert a new Entry
    addEntry(hash, key, value, i);
    return null;
}

Source of addEntry and createEntry:

void addEntry(int hash, K key, V value, int bucketIndex) {
    // Before inserting, resize if size has reached the threshold and the target bucket is already occupied
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}
void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    // Head insertion: the new entry is placed in front of the first Entry of this bucket's chain
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}

That is how HashMap writes an Entry. The time complexity is O(1) on average, with the occasional resize amortized over many insertions.

remove: removing an element

final Entry<K,V> removeEntryForKey(Object key) {
    if (size == 0) {
        return null;
    }
    // Compute the hash from the key and derive the index
    int hash = (key == null) ? 0 : hash(key);
    int i = indexFor(hash, table.length);
    // Linked-list deletion with two pointers; prev is the predecessor
    Entry<K,V> prev = table[i];
    Entry<K,V> e = prev;
    // Walk the collision chain and remove the Entry whose key matches
    while (e != null) {
        Entry<K,V> next = e.next;
        Object k;
        // Found it
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k)))) {
            modCount++;
            size--;
            // The node to delete is the first node of the chain
            if (prev == e)
                table[i] = next;
            else
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e;
        e = next;
    }
    return e;
}

That is how HashMap removes an Entry. The time complexity is O(1) on average.
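The return values of the put and remove paths walked through above can be observed from the public API. A minimal sketch, with a hypothetical class name and sample keys:

```java
import java.util.HashMap;
import java.util.Map;

class PutRemoveDemo {
    // Returns the results of two puts and one remove on the same key, plus the final size.
    static String[] run() {
        Map<String, String> map = new HashMap<>();
        String first = map.put("k", "v1");   // key absent: a new Entry is added, null is returned
        String second = map.put("k", "v2");  // key present: value replaced, old value returned
        String removed = map.remove("k");    // Entry unlinked from its chain, its value returned
        return new String[] { first, second, removed, String.valueOf(map.size()) };
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", run()));
    }
}
```

Note that a null return from `put` or `remove` is ambiguous when null values are stored; `containsKey` distinguishes "absent" from "mapped to null".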

2. How HashMap hashes keys (hash function)

The hash function in HashMap is implemented by hash(Object k) together with indexFor(int h, int length). The source is:

final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }
    h ^= k.hashCode();
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    // Spread the bits to reduce the probability of collisions
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

Source for computing the index:

static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}

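HashMap uses a hash function to map a key to an index in the range [0, table.length - 1]. There are broadly two ways to compute such an index:

hash(key) % table.length, where a prime length is usually recommended. Hashtable in the JDK uses this modulo approach.
The reasons for preferring a prime can be found in the algorithms literature and are not repeated here.

hash(key) & (table.length - 1), where length must be a power of 2. HashMap in the JDK uses this approach.
Because length is a power of 2, hash(key) & (table.length - 1) always falls within [0, length - 1]. Doing only this, however, would cause many collisions: a Java hashCode is 32 bits wide, and when the HashMap capacity is small, for example 16, the AND operation always discards the high bits, so only the low bits decide the index and the chance of a collision goes up.

To reduce the probability of collisions, hash() therefore applies a series of shifts and XOR operations that mix the high bits into the low bits.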
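The effect of the bit mixing can be seen with two hash codes that differ only in their high bits. The sketch below copies the shift-and-XOR step from the JDK 1.7 hash(Object) shown above but leaves out the hashSeed and String special case, so it is an approximation of that method; the class name and the sample values 0x10000 and 0x20000 are chosen for illustration.

```java
class HashSpreadDemo {
    // Bit-mixing step from the JDK 1.7 hash(Object), without the hashSeed handling.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Same as HashMap.indexFor: length must be a power of 2.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000;
        int b = 0x20000;
        // Without mixing, both values have all-zero low bits and share bucket 0 of 16
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));                 // 0 0
        // With mixing, the high bits influence the index and the two values separate
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16)); // 1 2
    }
}
```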

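3. HashMap memory allocation strategy

Member variables capacity and loadFactor
HashMap requires its capacity to be a power of 2; the default capacity is 1 << 4 = 16. HashMap also has a load factor (loadFactor). A higher load factor reduces the space overhead but increases the cost of lookups (which shows up in most operations, including put and get). The default loadFactor of 0.75 is a good tradeoff between time and space cost.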

 static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
 static final int MAXIMUM_CAPACITY = 1 << 30;
 static final float DEFAULT_LOAD_FACTOR = 0.75f;

The HashMap constructor
The constructor sets the initial values of capacity and loadFactor.

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    threshold = initialCapacity;
    init();
}

As noted above, the capacity of a HashMap must be a power of 2, yet the constructor does not enforce this. How, then, is the capacity guaranteed to be a power of 2?
During put, the source checks whether the hash table is still empty; if it is, it calls inflateTable(int toSize).

private void inflateTable(int toSize) {
    // Find a power of 2 >= toSize
    int capacity = roundUpToPowerOf2(toSize);
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity);
}

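roundUpToPowerOf2 returns the smallest power of 2 that is greater than or equal to its argument.

private static int roundUpToPowerOf2(int number) {
    // assert number >= 0 : "number must be non-negative";
    return number >= MAXIMUM_CAPACITY
            ? MAXIMUM_CAPACITY
            : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
}

Integer.highestOneBit(int) keeps the highest 1 bit of its argument and sets all lower bits to 0. Put simply, for a positive argument it returns the largest power of 2 that is less than or equal to the argument.

If number is a power of 2, subtracting 1 moves the highest 1 bit down one position, and shifting left by 1 moves it back to its original position, so the result is number itself.
If number is not a power of 2, subtracting 1 leaves the highest 1 bit where it was, and shifting left by 1 moves it up one position, so the result is the next power of 2 above number.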
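The two cases can be checked with a few concrete inputs. The sketch below reproduces roundUpToPowerOf2 from the listing above in a standalone class; the class name `RoundUpDemo` and the sample inputs are chosen for illustration.

```java
class RoundUpDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same expression as the JDK 1.7 helper shown above.
    static int roundUpToPowerOf2(int number) {
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    public static void main(String[] args) {
        // 16 is a power of 2: (16 - 1) << 1 = 30, highest one bit = 16
        System.out.println(roundUpToPowerOf2(16)); // 16
        // 17 is not: (17 - 1) << 1 = 32, highest one bit = 32
        System.out.println(roundUpToPowerOf2(17)); // 32
        // 5 is not: (5 - 1) << 1 = 8, highest one bit = 8
        System.out.println(roundUpToPowerOf2(5));  // 8
        // 0 and 1 both take the final branch
        System.out.println(roundUpToPowerOf2(1));  // 1
    }
}
```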

Resizing:
HashMap may resize during a put. The source is:

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    // The table has already reached the maximum capacity, 1 << 30
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    // Move the entries from oldTable into newTable
    // The return value of initHashSeedAsNeeded decides whether hashes are recomputed
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    // Recompute the threshold
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    // Iterate over the old table
    for (Entry<K,V> e : table) {
        // Walk the collision chain
        while(null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                // Recompute the hash
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            // Insert the entry at the head of the new chain (head insertion)
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

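That is the whole memory allocation process of HashMap. In summary, when putting an Entry, HashMap compares its current size with threshold (and checks that the target bucket is already occupied) to decide whether to resize. Each resize doubles the table to 2 * table.length. During a resize, the return value of initHashSeedAsNeeded determines whether the hash values need to be recomputed.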
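The relationship between capacity, load factor and threshold is plain arithmetic and can be tabulated. The sketch below reuses the threshold expression from inflateTable and resize; the class name and the printed range of capacities are assumptions for illustration.

```java
class ThresholdDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same expression as in inflateTable and resize above.
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        // With the default load factor, each doubling of the table doubles the threshold
        for (int capacity = 16; capacity <= 128; capacity *= 2) {
            System.out.println(capacity + " -> " + threshold(capacity, 0.75f));
        }
    }
}
```

With the defaults, a table of 16 buckets has a threshold of 12. When the number of entries is known in advance, passing a large enough initial capacity to the constructor avoids repeated resizing.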

4. HashMap's iterators

The ValueIterator, KeyIterator, and EntryIterator classes in HashMap are all based on HashIterator. Its source is:

private abstract class HashIterator<E> implements Iterator<E> {
    Entry<K,V> next;        // next entry to return
    int expectedModCount;   // For fast-fail
    int index;              // current slot, table index
    Entry<K,V> current;     // current entry
    HashIterator() {
        expectedModCount = modCount;
        // Find the first Entry in the hash table
        if (size > 0) {
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    }
    public final boolean hasNext() {
        return next != null;
    }
    final Entry<K,V> nextEntry() {
        // HashMap is not thread-safe, so check for structural modification before advancing
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Entry<K,V> e = next;
        if (e == null)
            throw new NoSuchElementException();
        if ((next = e.next) == null) {
            // Find the next Entry
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
        current = e;
        return e;
    }
    public void remove() {
        if (current == null)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Object k = current.key;
        current = null;
        HashMap.this.removeEntryForKey(k);
        expectedModCount = modCount;
    }
}

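Wrapping the Key, Value, and Entry iterators yields the three collection views keySet, values, and entrySet. All three views support remove, removeAll, and clear, which write through to the HashMap; they do not support add or addAll.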
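The write-through behavior of the views can be confirmed with a short sketch. The class and method names are hypothetical; the calls themselves are standard `java.util` API.

```java
import java.util.HashMap;
import java.util.Map;

class ViewDemo {
    // Removing through keySet() removes the mapping from the backing map.
    static boolean removeThroughKeySet() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.keySet().remove("a");
        return !map.containsKey("a") && map.size() == 1;
    }

    // Clearing through values() empties the backing map.
    static boolean clearThroughValues() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.values().clear();
        return map.isEmpty();
    }

    // The views reject add, since a key or a value alone does not form a mapping.
    static boolean addIsUnsupported() {
        Map<String, Integer> map = new HashMap<>();
        try {
            map.keySet().add("x");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeThroughKeySet());
        System.out.println(clearThroughValues());
        System.out.println(addIsUnsupported());
    }
}
```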