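I've been learning Python as my first programming language for a few weeks. I decided to write a lotto simulation using Numba. The code runs fine on my CPU at roughly 250k iterations per second. I'd really like to see how it performs on my NVIDIA GPU, but I'm a bit out of my depth. I'd be very grateful if someone could give me a hand.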
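My code:
"""Lotto simulation: draw random tickets until one matches your numbers."""
import time

import numpy as np  # NumPy provides the random sampling and array comparison.
from numba import jit  # jit compiles lotto_numbers to machine code.

# Note: the timer starts before the first call to lotto_numbers, so the
# measured time also includes Numba's one-off JIT compilation.
start = time.time()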
@jit(nopython=True)
def lotto_numbers():
    """
    Generate six unique random numbers between 1 and 59.

    :return: A sorted numpy array of 6 unique random numbers between 1 and 59
    """
    # choice(59, ...) samples from 0..58; adding 1 shifts the range to 1..59,
    # so no draw ever contains zero and no retry loop is needed.
    balls = np.random.choice(59, 6, replace=False) + 1
    balls.sort()
    return balls
your_numbers = lotto_numbers()  # optionally choose 6 numbers (not yet implemented).
print(f"Your numbers are: {your_numbers}")

MATCH = False
GAME_COUNT = 0
while not MATCH:
    tmp = lotto_numbers()
    cmp = tmp == your_numbers
    GAME_COUNT += 1
    if cmp.all():
        MATCH = True
        TOTAL_COST = GAME_COUNT * 2
        TOTAL_WEEKS = GAME_COUNT / 2
        TOTAL_YEARS = round(TOTAL_WEEKS / 52, 1)
        end = time.time()
        TOTAL_TIME = end - start
        GPS = GAME_COUNT / TOTAL_TIME
        print(f"\n\nWINNER!!! after {GAME_COUNT:,} games")
        print(f"\ttotal cost: £{TOTAL_COST:,}")
        print(f"\tweeks until win: {TOTAL_WEEKS:,} weeks")
        print(f"\tyears until win: {TOTAL_YEARS:,} years")
        print(
            f"\texecution time: {TOTAL_TIME:,.2f} seconds at {GPS:,.2f} games per second")
    if GAME_COUNT % 10000 == 0:
        print(f"Game : {GAME_COUNT:,}", end='\r')
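For a sense of scale, the expected length of a run can be estimated without simulating anything. The sketch below assumes a 6-from-59 draw, as the docstring describes, and takes the 250k games per second figure from the post. The constant names are illustrative only.

```python
import math

NUMBERS_IN_POOL = 59        # balls in the machine (from the docstring)
NUMBERS_DRAWN = 6           # balls drawn per game
GAMES_PER_SECOND = 250_000  # CPU throughput quoted in the post

# Number of distinct tickets. Each game wins with probability 1/combinations,
# so the expected number of games until the first win equals combinations.
combinations = math.comb(NUMBERS_IN_POOL, NUMBERS_DRAWN)
expected_seconds = combinations / GAMES_PER_SECOND

print(f"distinct tickets: {combinations:,}")              # 45,057,474
print(f"expected run time: {expected_seconds:,.0f} s")    # 180 s
```

Individual runs vary widely around this average, because the number of games until the first win follows a geometric distribution.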
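I know I need to write a CUDA kernel to target the GPU. I think I should be able to run in float16, since the numbers aren't complex. Also, @vectorize seems important. But honestly, I'm treading water.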
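Before writing a kernel, it may help to see the problem in a data-parallel shape, where many independent games are drawn at once and compared in bulk. The following is a CPU-only NumPy sketch of that layout, not CUDA code. The names `draw_batch` and `count_jackpots` and the batch size are hypothetical choices for illustration. Because the values are whole numbers from 1 to 59, the sketch stores them as `uint8` rather than float16.

```python
import numpy as np


def draw_batch(rng, batch_size):
    """Draw batch_size games; each row holds 6 unique sorted numbers in 1..59."""
    # One random key per ball per game. Ranking the 59 keys in a row yields a
    # random permutation of 0..58, so the first 6 ranks are a draw without
    # replacement.
    keys = rng.random((batch_size, 59), dtype=np.float32)
    picks = np.argsort(keys, axis=1)[:, :6] + 1  # shift 0..58 to 1..59
    picks.sort(axis=1)
    return picks.astype(np.uint8)


def count_jackpots(games, your_numbers):
    """Count the rows of games that match your_numbers in all 6 positions."""
    return int((games == your_numbers).all(axis=1).sum())


rng = np.random.default_rng(seed=0)
your_numbers = np.array([3, 11, 19, 27, 42, 58], dtype=np.uint8)

batch = draw_batch(rng, 1000)
print(batch[:3])
print("jackpots in this batch:", count_jackpots(batch, your_numbers))
```

Each row is independent of the others, which is the property a GPU port would rely on: one thread (or a few) per game, with only the match count brought back to the host.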