redisforpython

Published: 2022-10-24 14:53:06

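A. How to write large amounts of data to Redis efficiently

The steps are as follows:
1. Create a text file containing the Redis commands
SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN
Given the raw data, building this file is not hard; a shell or Python script will do.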
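2. Convert these commands to the Redis protocol.
This is needed because the Redis pipe feature accepts the Redis protocol rather than plain Redis commands.
The conversion script is shown further below.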
3. Insert the data through the pipe
cat data.txt | redis-cli --pipe
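Steps 1 and 2 can also be combined in Python. The following is a minimal sketch, not the script used in this answer: `encode_command` and `encode_lines` are names invented here, and the sketch assumes each text line holds one command whose arguments are separated by whitespace (quoted arguments are handled through `shlex`).

```python
import shlex


def encode_command(*args):
    """Encode one command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # The length prefix counts bytes, not characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def encode_lines(lines):
    """Turn plain-text command lines into one protocol payload."""
    payload = []
    for line in lines:
        args = shlex.split(line)  # honours quotes, e.g. SET k "two words"
        if args:                  # skip blank lines
            payload.append(encode_command(*args))
    return b"".join(payload)


payload = encode_lines(["SET Key0 Value0", "SET Key1 Value1"])
print(payload.decode("utf-8"))
```

Writing `payload` to data.txt in binary mode would give a file that can be fed to the command in step 3.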
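Shell vs. Redis pipe
The test below compares the efficiency of a shell batch import with Redis pipe.
Test approach: insert 100,000 identical values into the database, once with a shell script and once with Redis pipe, and compare the time each takes.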
Shell
The script is as follows:
#!/bin/bash
for ((i=0;i<100000;i++))
do
    echo -en "helloworld" | redis-cli -x set name$i >>redis.log
done
Every inserted value is helloworld, but the keys differ: name0, name1 ... name99999.
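Redis pipe
The Redis pipe approach takes a little more work.
1> First, build the text file of Redis commands
Here I chose Python:
#!/usr/bin/python
for i in range(100000):
    # a single formatted string prints the same on Python 2 and Python 3
    print('set name%d helloworld' % i)
# python 1.py > redis_commands.txt
# head -2 redis_commands.txt
set name0 helloworld
set name1 helloworld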
2> Convert these commands to the Redis protocol
Here I used a shell script found on GitHub:
#!/bin/bash
while read CMD; do
    # each command begins with *{number of arguments in command}\r\n
    XS=($CMD); printf "*${#XS[@]}\r\n"
    # for each argument, we append ${length}\r\n{argument}\r\n
    for X in $CMD; do printf "\$${#X}\r\n$X\r\n"; done
done < redis_commands.txt
# bash 20.sh > redis_data.txt    (the script uses bash arrays, so run it with bash)
# head -7 redis_data.txt
*3
$3
set
$5
name0
$10
helloworld
At this point the test data is ready.
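Before piping the file into redis-cli it can be worth checking that every declared length matches its argument. The reader below is a rough sketch for that purpose: `read_commands` is a hypothetical helper, and it only understands the array-of-bulk-strings form produced above, not the full protocol.

```python
import io


def read_commands(stream):
    """Yield each command in a protocol stream as a list of byte strings."""
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of input
        if not header.startswith(b"*") or not header.endswith(b"\r\n"):
            raise ValueError("expected array header, got %r" % header)
        args = []
        for _ in range(int(header[1:-2])):
            size_line = stream.readline()
            if not size_line.startswith(b"$") or not size_line.endswith(b"\r\n"):
                raise ValueError("expected bulk length, got %r" % size_line)
            size = int(size_line[1:-2])
            body = stream.read(size + 2)  # argument plus trailing \r\n
            if len(body) != size + 2 or body[-2:] != b"\r\n":
                raise ValueError("argument does not match declared length")
            args.append(body[:-2])
        yield args


sample = b"*3\r\n$3\r\nset\r\n$5\r\nname0\r\n$10\r\nhelloworld\r\n"
print(list(read_commands(io.BytesIO(sample))))
```

Opening redis_data.txt in binary mode and counting the commands that `read_commands` yields should give 100000 for the file built above; a `ValueError` points at a malformed entry.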
Test results

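B. Which libraries does a Python crawler need?

I. Request libraries

1. requests
requests is a third-party library that is more convenient to use than Python's built-in urllib.

2. selenium
Used to drive a browser and simulate user actions.
3. chromedriver
Install chromedriver to drive Chrome.

4. aiohttp
aiohttp is an asynchronous request library that can improve throughput when fetching data.

II. Parsing libraries
1. lxml
lxml is a Python parsing library that supports HTML and XML, supports XPath, and parses very efficiently.
2. beautifulsoup4
Beautiful Soup makes it easier to extract data from HTML documents.

3. pyquery
pyquery is a web page parsing library that uses a jQuery-like syntax to parse HTML documents.
III. Storage libraries
1. mysql
2. mongodb
3. redis
IV. Crawler framework: scrapy
Scrapy is an asynchronous crawler framework implemented in pure Python, used to scrape web page content and images.
Its basic dependencies, such as lxml, pyOpenSSL and Twisted, must be installed first.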

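C. What can Python do with Redis?

redis-py provides two classes, Redis and StrictRedis, that implement the Redis commands. StrictRedis implements most of the official commands using the official syntax and command names; Redis is a subclass of StrictRedis kept for backward compatibility with older versions of redis-py. (Since redis-py 3.0 the two names refer to the same class.)
import redis  # import the redis module to operate Redis from Python; the cache can also be operated directly on the Redis server host
r = redis.Redis(host='192.168.19.130', port=6379)  # host is the Redis server; both server and client must be running; the default Redis port is 6379
r.set('foo', 'Bar')  # key is "foo", value is "Bar"; stores the key-value pair in Redis
print(r.get('foo'))  # fetches the value stored under key foo: Bar (b'Bar' on Python 3 unless decode_responses=True)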
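On Python 3, redis-py returns values as bytes unless the client is created with decode_responses=True. The sketch below shows one way to decode a reply by hand. `InMemoryClient` and `get_text` are hypothetical names; the in-memory class only mimics `set` and `get` so that the example runs without a Redis server.

```python
class InMemoryClient:
    """Hypothetical stand-in exposing set/get like redis.Redis, for offline runs."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # redis-py sends str values as UTF-8 bytes; mimic that here.
        self._data[key] = value if isinstance(value, bytes) else str(value).encode("utf-8")
        return True

    def get(self, key):
        return self._data.get(key)  # bytes, or None when the key is missing


def get_text(client, key, default=None):
    """Fetch a key and decode the bytes reply to str."""
    raw = client.get(key)
    return default if raw is None else raw.decode("utf-8")


client = InMemoryClient()  # swap for redis.Redis(host=..., port=6379) against a real server
client.set("foo", "Bar")
print(client.get("foo"))        # raw reply is bytes
print(get_text(client, "foo"))  # decoded to str
```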

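D. Syncing MySQL to Redis with Python: there is a lot of data, and reading rows one at a time and writing them to Redis is too slow. Is there a way to do it in bulk?

Fast sync from MySQL to Redis
Example scenario: storing game players' mission data; when the game server starts, it syncs the players' data from MySQL into Redis.
The data is imported from MySQL into a Redis hash. The most direct approach is of course to iterate over the MySQL rows and write them to Redis one by one. There is nothing wrong with that, but it is very slow. If the MySQL query output can be made to match the input protocol that redis-cli expects, much of that overhead and time can be saved.
The MySQL database is named GAME_DB; an example table structure:
CREATE TABLE TABLE_MISSION (
    playerId int(11) unsigned NOT NULL,
    missionList varchar(255) NOT NULL,
    PRIMARY KEY (playerId)
);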

The Redis data structure used is a hash:
The key is missions, the hash field is the playerId from MySQL, and the hash value is the corresponding missionList. For example:
[root@iZ23zcsdouzZ ~]# redis-cli
127.0.0.1:6379> hget missions 36598
"{\"10001\":{\"status\":1,\"progress\":0},\"10002\":{\"status\":1,\"progress\":0},\"10003\":{\"status\":1,\"progress\":0},\"10004\":{\"status\":1,\"progress\":0}}"

Fast sync method:
Create a file with the .sql extension: mysql2redis_mission.sql
with the following content:
SELECT CONCAT(
    "*4\r\n",
    '$', LENGTH(redis_cmd), '\r\n',
    redis_cmd, '\r\n',
    '$', LENGTH(redis_key), '\r\n',
    redis_key, '\r\n',
    '$', LENGTH(hkey), '\r\n',
    hkey, '\r\n',
    '$', LENGTH(hval), '\r\n',
    hval, '\r'
)
FROM (
    SELECT
        'HSET' AS redis_cmd,
        'missions' AS redis_key,
        playerId AS hkey,
        missionList AS hval
    FROM TABLE_MISSION
) AS t

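Create a shell script mysql2redis_mission.sh
with the content:
mysql GAME_DB --skip-column-names --raw < mysql2redis_mission.sql | redis-cli --pipe

Run the shell script in a Linux terminal, or run the command directly, to sync the TABLE_MISSION table of the MySQL database GAME_DB into the Redis key missions. The mysql2redis_mission.sql file makes the MySQL output format match the input protocol Redis expects, which greatly shortens the sync time.
In testing, syncing the same data by fetching each row, reformatting it and writing it to Redis took 5 min, whereas the SQL file and shell command above finished in about 3 s.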
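Since the question asks about Python, a different technique from the SQL-plus-redis-cli method above is worth sketching: redis-py's pipeline, which buffers commands on the client and sends each batch in one round trip. This is an illustration under assumptions, not the author's method. `chunks`, `sync_missions` and the batch size of 1000 are invented for the example, and `RecordingClient` is a hypothetical stand-in so the sketch runs without a server.

```python
from itertools import islice


def chunks(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def sync_missions(client, rows, batch_size=1000, key="missions"):
    """Write (playerId, missionList) rows into one hash, one round trip per batch."""
    written = 0
    for chunk in chunks(rows, batch_size):
        pipe = client.pipeline(transaction=False)  # buffer commands client-side
        for player_id, mission_list in chunk:
            pipe.hset(key, player_id, mission_list)
        pipe.execute()  # send the whole batch at once
        written += len(chunk)
    return written


class RecordingClient:
    """Hypothetical stand-in that records batches, so the sketch runs offline."""

    def __init__(self):
        self.batches = []
        self._buffer = []

    def pipeline(self, transaction=True):
        return self  # a real client returns a separate Pipeline object

    def hset(self, name, key, value):
        self._buffer.append((name, key, value))

    def execute(self):
        self.batches.append(self._buffer)
        self._buffer = []


demo = RecordingClient()
total = sync_missions(demo, [(36598, "{}"), (36599, "{}"), (36600, "{}")], batch_size=2)
print(total, [len(batch) for batch in demo.batches])
```

Against a real server, `client` would be a `redis.Redis` instance and `rows` the result of a query on TABLE_MISSION. Whether this gets close to the 3 s figure reported above has not been measured here.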

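E. Difference between Redis and a cache (Memcached) in Python

Briefly:
1. In Redis, not all data is necessarily kept in memory at all times; this is one of the biggest differences from Memcached. (This describes the virtual memory feature of early Redis versions, which was later removed; current Redis keeps the whole dataset in memory.)
2. Redis supports not only simple k/v data but also data structures such as list, set and hash.
3. Redis supports data backup through master-slave replication.
4. Redis supports persistence: in-memory data can be saved to disk and loaded again after a restart.

In many respects Redis has the characteristics of a database, or simply is a database system, whereas Memcached is just a simple K/V cache.

Below is a statement from the author of Redis (on Stack Overflow).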
You should not care too much about performances. Redis is faster per core with small values, but memcached is able to use multiple cores with a single executable and TCP port without help from the client. Also memcached is faster with big values in the order of 100k. Redis recently improved a lot about big values (unstable branch) but still memcached is faster in this use case. The point here is: nor one or the other will likely going to be your bottleneck for the query-per-second they can deliver.
You should care about memory usage. For simple key-value pairs memcached is more memory efficient

F. How to install Redis for Python

Detailed steps for installing Redis and the Python redis module
Install Redis
Install Redis into the /opt/redis-2.8 directory:

tar -zxf redis-2.8.1.tar.gz
cd redis-2.8.1
make && make PREFIX=/opt/redis-2.8 install
cp redis.conf /opt/redis-2.8/
Redis is used here only as a queue and needs no persistence, so edit /opt/redis-2.8/redis.conf:
set daemonize yes
comment out the three save ... lines, which turns off persistence to disk
Starting Redis is simple: /opt/redis-2.8/bin/redis-server /opt/redis-2.8/redis.conf
With minor changes, the script $REDIS_INSTALL_DIR/utils/redis_init_script can be placed in /etc/init.d and used as the Redis startup script.
Install Python
The Python 2.4 bundled with CentOS is too old, so upgrade to 2.7:

tar -zvxf Python-2.7.6.tgz
cd Python-2.7.6
./configure
make && make install
Replace the system default python: sudo ln -s /usr/local/bin/python2.7 /usr/bin/python
Install the Python redis module
wget --no-check-certificate 2.8.0.tar.gz
tar -zvxf redis-2.8.0.tar.gz
mv redis-2.8.0 python-redis-2.8.0
cd python-redis-2.8.0
python setup.py install
Deployment complete.

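G. What are the commonly used Python standard libraries and third-party libraries?

Let me add a few.
standard libs:

itertools http://docs.python.org/2/library/itertools.html

functools http://docs.python.org/2/library/functools.html Anyone who wants to learn Python well should master these two libraries.
re regular expressions
subprocess http://docs.python.org/2/library/subprocess.html a great tool for calling shell commands
pdb debugging
traceback debugging
pprint pretty-printing
logging logging
threading and multiprocessing multithreading and multiprocessing
urllib/urllib2/httplib HTTP libraries; httplib is lower level; the third-party library requests is recommended
os/sys system and environment
Queue queues
pickle/cPickle serialization tools
hashlib md5, sha and other hash algorithms
csv
json/simplejson Python JSON libraries; according to discussions and benchmarks on SO, simplejson performs better than json
timeit measures the running time of code
cProfile Python profiling module
glob similar to listing files; can be used to find files
atexit provides a registration function for running code just before the script exits
dis Python disassembler; when the workings of a statement are unclear, dis.dis can show the interpreter instructions the code compiles to.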
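As a small illustration of the dis entry, the sketch below disassembles a two-line function. The function `add` is made up for the example, and the exact instruction names in the listing vary between Python versions.

```python
import dis
import io


def add(a, b):
    return a + b


buffer = io.StringIO()
dis.dis(add, file=buffer)  # write the listing to a buffer instead of stdout
listing = buffer.getvalue()
print(listing)
```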

third-party libs:

paramiko https://github.com/paramiko/paramiko SSH library for Python
selenium https://pypi.python.org/pypi/selenium Python interface to the Selenium browser automation testing tool
lxml http://lxml.de/ a great tool for parsing HTML and XML in Python
mechanize https://pypi.python.org/pypi/mechanize/ Stateful programmatic web browsing

pycurl https://pypi.python.org/pypi/pycurl cURL library module for Python
Fabric http://docs.fabfile.org/en/1.8/
Fabric is a Python (2.5 or higher) library and command-line tool for
streamlining the use of SSH for application deployment or systems
administration tasks.

xmltodict https://github.com/martinblech/xmltodict converts XML to dict; really handy
urllib3 and requests: in practice requests alone is enough (Requests: HTTP for Humans)
flask http://flask.pocoo.org/ Python web microframework
ipdb a great debugger; ipython is recommended too, and the two work well together
redis Python interface for Redis
pymongo Python interface for MongoDB
PIL http://www.pythonware.com/products/pil/ Python image processing
mako http://www.makotemplates.org/ Python template engine
numpy, scipy scientific computing
matplotlib plotting

scrapy crawling
django/tornado/web.py/web2py/uliweb/flask/twisted/bottle/cherrypy and so on: Python web frameworks/servers
sh (sh v1.08 documentation) an excellent module for running shell commands

That is all I can remember for now, and these are just the libraries I use often :) Additions are welcome.

UPDATE:
A curated list of awesome Python frameworks, libraries and software.

vinta/awesome-python · GitHub

Almost every great Python library and framework is in this list.

Other awesome lists:
bayandin/awesome-awesomeness · GitHub

I. Are Python Redis connections thread-safe?

Before ConnectionPool, whenever I needed to connect to Redis I used the StrictRedis class. Its source code describes the class as follows:

redis.StrictRedis: Implementation of the Redis protocol. This abstract class provides a Python interface to all Redis commands and an implementation of the Redis protocol. Connection and Pipeline derive from this, implementing how the commands are sent and received to the Redis server.
Usage:

r = redis.StrictRedis(host=xxxx, port=xxxx, db=xxxx)
r.xxxx()

With the ConnectionPool class, you can instead do the following:

pool = redis.ConnectionPool(host=xxx, port=xxx, db=xxxx)
r = redis.Redis(connection_pool=pool)

Here Redis is a subclass of StrictRedis.
A brief analysis follows.
The __init__ method of StrictRedis accepts a connection_pool argument, which is a ConnectionPool object:

class StrictRedis(object):
    ........
    def __init__(self, host='localhost', port=6379,
                 db=0, password=None, socket_timeout=None,
                 socket_connect_timeout=None,
                 socket_keepalive=None, socket_keepalive_options=None,
                 connection_pool=None, unix_socket_path=None,
                 encoding='utf-8', encoding_errors='strict',
                 charset=None, errors=None,
                 decode_responses=False, retry_on_timeout=False,
                 ssl=False, ssl_keyfile=None, ssl_certfile=None,
                 ssl_cert_reqs=None, ssl_ca_certs=None):
        if not connection_pool:
            ..........
            connection_pool = ConnectionPool(**kwargs)
        self.connection_pool = connection_pool

When a StrictRedis instance executes a command it calls execute_command. As the implementation shows, it takes a connection from the pool, runs the command, and releases the connection afterwards:

# COMMAND EXECUTION AND PROTOCOL PARSING
def execute_command(self, *args, **options):
    "Execute a command and return a parsed response"
    pool = self.connection_pool
    command_name = args[0]
    connection = pool.get_connection(command_name, **options)  # calls ConnectionPool.get_connection to obtain a connection
    try:
        connection.send_command(*args)  # runs the command; this is Connection.send_command
        return self.parse_response(connection, command_name, **options)
    except (ConnectionError, TimeoutError) as e:
        connection.disconnect()
        if not connection.retry_on_timeout and isinstance(e, TimeoutError):
            raise
        connection.send_command(*args)
        return self.parse_response(connection, command_name, **options)
    finally:
        pool.release(connection)  # calls ConnectionPool.release to release the connection

Now look at the ConnectionPool class:

class ConnectionPool(object):
    ...........
    def __init__(self, connection_class=Connection, max_connections=None,
                 **connection_kwargs):  # constructor, called when the class is instantiated
        max_connections = max_connections or 2 ** 31
        if not isinstance(max_connections, (int, long)) or max_connections < 0:  # check that max_connections is valid
            raise ValueError('"max_connections" must be a positive integer')
        self.connection_class = connection_class  # store the parameters
        self.connection_kwargs = connection_kwargs
        self.max_connections = max_connections
        self.reset()  # reset performed when the ConnectionPool is initialised
    def reset(self):
        self.pid = os.getpid()
        self._created_connections = 0  # counter of connections created so far
        self._available_connections = []  # empty list that holds the available connections
        self._in_use_connections = set()  # empty set that holds the connections in use
        self._check_lock = threading.Lock()
    .......
    def get_connection(self, command_name, *keys, **options):  # obtains a connection from the pool
        "Get a connection from the pool"
        self._checkpid()
        try:
            # pop an available connection; on the first call _available_connections
            # is empty, so make_connection is called instead
            connection = self._available_connections.pop()
        except IndexError:
            connection = self.make_connection()
        self._in_use_connections.add(connection)  # add it to the set of connections in use
        return connection
    def make_connection(self):  # called when _available_connections is empty
        "Create a new connection"
        if self._created_connections >= self.max_connections:  # check whether the limit has been reached; max_connections can be set at initialisation
            raise ConnectionError("Too many connections")
        self._created_connections += 1  # increment the count of created connections
        return self.connection_class(**self.connection_kwargs)  # return a connection, by default Connection(**self.connection_kwargs)
    def release(self, connection):  # release a connection; it is not closed, only kept in the pool
        "Releases the connection back to the pool"
        self._checkpid()
        if connection.pid != self.pid:
            return
        self._in_use_connections.remove(connection)  # remove it from the in-use set
        self._available_connections.append(connection)  # and append it to the _available_connections list
    def disconnect(self):  # disconnect every connection in the pool
        "Disconnects all connections in the pool"
        all_conns = chain(self._available_connections,
                          self._in_use_connections)
        for connection in all_conns:
            connection.disconnect()

execute_command ultimately calls Connection.send_command, and connections are closed by Connection.disconnect. The Connection class is implemented as follows:

class Connection(object):
    "Manages TCP communication to and from a Redis server"
    def __del__(self):  # when the object is deleted, call disconnect to release the connection
        try:
            self.disconnect()
        except Exception:
            pass

The core connection setup is implemented with the socket module:

def _connect(self):
    err = None
    for res in socket.getaddrinfo(self.host, self.port, 0,
                                  socket.SOCK_STREAM):
        family, socktype, proto, canonname, socket_address = res
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            # TCP_NODELAY
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            # TCP_KEEPALIVE
            if self.socket_keepalive:  # socket_keepalive is not set by default, so TCP keepalive is off by default
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
                for k, v in iteritems(self.socket_keepalive_options):
                    sock.setsockopt(socket.SOL_TCP, k, v)
            # set the socket_connect_timeout before we connect
            sock.settimeout(self.socket_connect_timeout)  # defaults to None in the constructor, i.e. the connect is blocking
            # connect
            sock.connect(socket_address)
            # set the socket_timeout now that we're connected
            sock.settimeout(self.socket_timeout)  # socket_timeout defaults to None in the constructor
            return sock
        except socket.error as _:
            err = _
            if sock is not None:
                sock.close()
    .....

The method that closes the connection:

def disconnect(self):
    "Disconnects from the Redis server"
    self._parser.on_disconnect()
    if self._sock is None:
        return
    try:
        self._sock.shutdown(socket.SHUT_RDWR)  # shutdown first, then close
        self._sock.close()
    except socket.error:
        pass
    self._sock = None

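To summarise:
1) By default, each Redis instance you create constructs its own ConnectionPool. Every access to Redis takes a connection from that pool and returns it when the operation completes (the connection is not closed). You can instead build one shared ConnectionPool and pass it in when creating Redis instances; subsequent operations then take connections from the given pool and no further ConnectionPool objects are created.
2) By default neither keepalive nor a timeout is set, so the connections are blocking sockets with TCP keepalive switched off. The author calls these short connections, although as point 1 notes the pool keeps each socket open and reuses it.
3) Leaving the underlying TCP layer aside, the connections in the pool are all torn down together in ConnectionPool.disconnect.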
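The get/release bookkeeping summarised above can be modelled in a few lines. This is a simplified sketch and not redis-py's code: `ToyPool`, `ToyConnection` and `run_command` are invented names, no sockets are opened, and the explicit lock around the bookkeeping is an assumption added here rather than something shown in the excerpt.

```python
import threading


class ToyConnection:
    """Hypothetical stand-in for a socket connection."""


class ToyPool:
    """Simplified model of the pool's get/release bookkeeping."""

    def __init__(self, max_connections=2):
        self.max_connections = max_connections
        self._created = 0
        self._available = []
        self._in_use = set()
        self._lock = threading.Lock()  # guards the three fields above

    def get_connection(self):
        with self._lock:
            if self._available:
                conn = self._available.pop()   # reuse an idle connection
            elif self._created < self.max_connections:
                self._created += 1
                conn = ToyConnection()         # lazily create a new one
            else:
                raise RuntimeError("Too many connections")
            self._in_use.add(conn)
            return conn

    def release(self, conn):
        with self._lock:
            self._in_use.remove(conn)
            self._available.append(conn)       # kept open for the next caller


def run_command(pool, results):
    conn = pool.get_connection()
    try:
        results.append(id(conn))               # stand-in for sending a command
    finally:
        pool.release(conn)                     # always hand the connection back


pool = ToyPool(max_connections=2)
seen = []
for _ in range(5):                             # five sequential "commands"
    run_command(pool, seen)
print(len(set(seen)), "connection object(s) served 5 commands")
```

Because each command releases its connection in a `finally` block, sequential callers keep reusing the same object, which mirrors point 1 of the summary.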
